Methods and compositions for homology-directed repair of Cas endonuclease mediated double strand breaks

ABSTRACT

Compositions and methods are provided for effecting homology-directed repair (HDR) at the site of a double-strand break (DSB) in a polynucleotide. In some aspects, the frequency of HDR is improved as compared to a control method. In some aspects, the ratio of homologous recombination (HR) to non-homologous end joining (NHEJ) is increased. In some aspects, the percentage of HR relative to the total number of mutant reads is increased.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a 371 National Stage Entry of PCT Patent Application No. PCT/US2019/031023 filed on 7 May 2019, which claims the benefit of U.S. Provisional Patent Application Ser. No. 62/667,968 filed 7 May 2018 and U.S. Provisional Patent Application Ser. No. 62/751,845 filed 29 Oct. 2018, all of which are herein incorporated by reference in their entireties.

REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY

The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named 7664USPCT_SequenceListing_ST25.txt created on 15 Oct. 2020 and having a size of 20,147 bytes and is filed concurrently with the specification. The sequence listing comprised in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

The disclosure relates to the field of molecular biology, in particular to compositions of guide polynucleotide/endonuclease systems, and compositions and methods for modifying the genome of a cell.

BACKGROUND

Recombinant DNA technology has made it possible to insert DNA sequences at targeted genomic locations and/or modify specific endogenous chromosomal sequences. Site-specific integration techniques, which employ site-specific recombination systems, as well as other types of recombination technologies, have been used to generate targeted insertions of genes of interest in a variety of organisms. Genome-editing techniques such as designer zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), or homing meganucleases are available for producing targeted genome perturbations, but these systems tend to have low specificity and employ designed nucleases that need to be redesigned for each target site, which renders them costly and time-consuming to prepare.

Newer technologies utilizing archaeal or bacterial adaptive immunity systems have been identified, called CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats), which comprise different domains of effector proteins that encompass a variety of activities (DNA recognition, binding, and optionally cleavage).

Despite the identification and characterization of some of these systems, there remains a need for methods and compositions for improving the frequency of homology-directed repair of double-strand-break sites.

SUMMARY

Methods and compositions are provided for improving the probability (frequency) of homology-directed repair (HDR) of a double-strand break at a target site.

In some aspects, HDR of a target site double-strand break (DSB) is promoted by recurrent cutting (RC) of the target site. Recurrent cutting may be accomplished by creating a first double-strand break with a DSB agent, such as a Cas endonuclease, at the target site, and allowing it to repair via the non-homologous end joining (NHEJ) pathway. The repaired double-strand break is then provided an RNA-guided Cas endonuclease, wherein the guide RNA comprises a sequence that is capable of hybridizing with the repaired DSB, which becomes a new “target sequence”. A second double-strand break is created by the Cas endonuclease. In some aspects, a single recurrent cut is performed. In some aspects, recursive iterations of recurrent cuts are performed, e.g. a subsequent DSB is created at the newly repaired previous DSB, which becomes a new target site. In some aspects, three, four, five, six, or more than six recurrent cuts are performed in such a recursive fashion. In any aspect, the DSB agent is a Cas endonuclease, for example, a Cas9 endonuclease. It is contemplated that any Cas endonuclease that cleaves a target polynucleotide within the recognition site may be used for any of the recurrent or recursive methods described herein. In principle, any DSB agent may be used.

In some aspects, HDR of a target site DSB is promoted by the creation of one or more nick sites adjacent to the double-strand break of the target site. In some aspects, this is accomplished by using one or multiple single-strand endonucleases targeted to the flanking regions of the double-strand break in a strand-specific manner. In some aspects, the single-strand endonuclease(s) must be heterologous to the double-strand endonuclease when using RNA guides for targeting. In some aspects, a non-RNA guided single-strand endonuclease may be designed to target the flanking sequences. In some aspects, two nicks are created, each one flanking one side of the target site or the DSB. In some aspects, the DSB agent is a Cas endonuclease, for example, a Cas9 endonuclease. In some aspects, the nickase is a Cas endonuclease that has been modified to retain single-strand nicking activity but lacks double-strand breaking activity. In principle, any DSB agent in combination with any nickase agent may be used. The distance between a nick and the DSB is at least 20, between 20 and 30, at least 30, between 30 and 40, at least 40, between 40 and 50, at least 50, between 50 and 60, at least 60, between 60 and 70, at least 70, between 70 and 80, at least 80, between 80 and 90, at least 90, between 90 and 100, at least 100, between 100 and 125, at least 125, between 125 and 150, at least 150, between 150 and 175, at least 175, between 175 and 200, at least 200, or even greater than 200 base pairs.

In some aspects, the method further includes providing to the double-strand break a heterologous polynucleotide that comprises a DNA sequence, optionally flanked by polynucleotides sharing homology to sequences flanking the target site. In some aspects, the DNA sequence is further flanked by polynucleotides sharing homology to the target site sequence. Alternatively, or in addition, in some aspects, the DNA sequence is further flanked by two polynucleotides, wherein each of the two polynucleotides shares homology to a mutation created by a repair of the double-strand break. The mutation may be one of the NHEJ mutations that arise after the initial repair of the DSB; for example, the two polynucleotides may each share homology to the first most prevalent, second most prevalent, or any other NHEJ mutation that arises after the initial repair of the DSB. In some aspects, a plurality of DNA sequences that comprise different flanking homology regions may be provided. Said plurality may be DNA sequences that are identical to each other and to either the original target site or to new target sites created by DSB repair, or said plurality may be a population of different DNA sequences, each sharing homology with a combination of any of the original target site and/or one or more new target sites created by subsequent DSB creation and repair.

In some aspects, the heterologous polynucleotide that comprises a DNA sequence is a “donor” DNA molecule that becomes inserted at the site of a DSB. In some aspects, the heterologous polynucleotide that comprises a DNA sequence is a “modification template” that provides the basis for template-directed repair of a DSB.

Any target polynucleotide may be used in any aspect of the method described herein. In some embodiments, the target polynucleotide is in an in vitro setting. In some embodiments, the target polynucleotide is in a cell, for example in the genome of a cell. In some embodiments, the cell is an animal cell, a fungal cell, a bacterial cell, or a plant cell.

In some aspects, a method is provided for obtaining a multicellular organism, tissue, or part from a cell that has had its genome modified by any of the methods or compositions described herein, wherein the multicellular organism, tissue, or part retains the modification.

BRIEF DESCRIPTION OF THE DRAWINGS AND THE SEQUENCE LISTING

The disclosure can be more fully understood from the following detailed description and the accompanying drawings and Sequence Listing, which form a part of this application.

FIG. 1 depicts some examples of DNA expression cassettes and circular double-stranded plasmid DNA molecules that may be a donor DNA molecule for insertion at a DSB and/or a template used for repair (“Donor/Repair Template”, “DRT”) that may be used for recurrent cleavage of a target polynucleotide to improve the frequency of HDR. FIG. 1A depicts a linear single-stranded DNA flanked by homology regions. FIG. 1B depicts a linear double-stranded DNA flanked by homology regions. FIG. 1C depicts a linear double-stranded DNA flanked by homology regions and further flanked by sequences sharing homology with the target site. FIG. 1D depicts a circular double-stranded DNA flanked by homology regions. FIG. 1E depicts a circular double-stranded DNA flanked by both homology regions and further flanked by sequences sharing homology with the target site.

FIG. 2A shows one variation of the recurrent cleavage method for improving HDR of a target polynucleotide DSB. FIG. 2B shows one variation of the recursive cleavage method for improving HDR of a target polynucleotide DSB.

FIG. 3 depicts a schematic of the break-nick method for improving HDR of a target polynucleotide DSB, with one double-strand break created and two flanking nicks created on either side of the DSB.

FIG. 4A depicts target sites of the nickase SpynCas9 guides near the MS45-BLAT2 target sequence, labeled 1, 2, 3, and 4, respectively. FIGS. 4B and 4C show the frequencies of mutations at the target site when: BLATCas9 and SpynCas9 were delivered but no guides were provided (no guide); BLAT and its cognate guide were used by themselves (BLAT only); and BLAT and its cognate guide were co-delivered with SpynCas9 and specified Spy guides. Mutant reads for each are shown in FIG. 4B and HDR frequencies for each are shown in FIG. 4C. FIG. 4D depicts target sites of the nickase SpynCas9 guides near the MS45 target sequence, labeled 3, 4, and 5, respectively. FIGS. 4E and 4F show the frequencies of mutations at the target site when: BlatCas9 and nSpynCas9 were delivered but no guides were provided (no guide); Blat and its cognate guide were used by themselves (Blat only); Blat and its cognate guide were co-delivered with nickase and specified paired nick site Spyn guides (5/3 and 5/4 pairs); SpynCas9 nickase and nick site guides (5/3 pair) were delivered without Blat or the Blat guide; and BlatCas9, Blat guide, and Spyn guides were delivered without co-delivering the nickase SpynCas9. Mutant reads for each are shown in FIG. 4E and HDR frequencies for each are shown in FIG. 4F. The target site is given as SEQ ID NO:73. The MS45 BLAT2 sgRNA guide is given as SEQ ID NO:74.

FIG. 5 depicts a “Blat-Spy-Blat” nickase-DSB agent-nickase strategy (FIG. 5A) performed at maize target site M16 in one germplasm line, for five replicates. The M16 sgRNA guide sequence is given as SEQ ID NO:77. Average mutant reads are shown in FIG. 5B and HDR frequencies are shown in FIG. 5C, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase Blat and its guides; and Blat nickase with its guides only, with no SpyCas9 or its cognate gRNA. The Blat-Spy-Blat strategy (FIG. 5D) was also performed at a different maize target site (NLB-CR8) in a different germplasm line, for five replicates. The NLB_8 sgRNA guide sequence is given as SEQ ID NO:78. Average mutant reads are shown in FIG. 5E and HDR frequencies are shown in FIG. 5F, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase Blat and its guides; and Blat nickase with its guides only, with no SpyCas9 or its cognate gRNA.

FIG. 6 depicts different strategies with S. pyogenes Cas9 as either a nickase or DSB agent, and S. aureus Cas9 as either a DSB agent or nickase, respectively. A “Spy-Sa-Spy” strategy (FIG. 6A) was performed at the MS45 maize genomic target site, for four replicates. The MS45 Sa sgRNA guide sequence is given as SEQ ID NO:75. Average mutant reads are shown in FIG. 6B and HDR frequencies are shown in FIG. 6C, for: SaCas9 delivered with no guides; SaCas9 delivered with its cognate gRNA; SaCas9, its cognate gRNA, nickase SpynCas9 and its guides; and SpynCas9 nickase with its guides only, with no SaCas9 or its cognate gRNA. A “Sa-Spy-Sa” strategy (FIG. 6D) was performed at the TS50 maize genomic target site, for three replicates. The TS50 sgRNA guide sequence is given as SEQ ID NO:76. Average mutant reads are shown in FIG. 6E and HDR frequencies are shown in FIG. 6F, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase SaCas9 and its guides; and SaCas9 nickase with its guides only, with no SpyCas9 or its cognate gRNA. The “Sa-Spy-Sa” strategy (FIG. 6G) was performed at a different maize genomic target site (TS45), for two replicates. Average mutant reads are shown in FIG. 6H and HDR frequencies are shown in FIG. 6I, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase SaCas9 and its guides; and SaCas9 nickase with its guides only, with no SpyCas9 or its cognate gRNA.

FIG. 7 shows the top 5 most prevalent NHEJ mutations for the Zea mays LIG1 target site (FIG. 7A), MS26 target site (FIG. 7B), MS45 target site (FIG. 7C), and TS45 target site (FIG. 7D). Streptococcus pyogenes Cas9 targets are underlined with the protospacer adjacent motif (PAM) being underlined with a solid line and the guide RNA target being underlined with a dashed line. Target sites targeted for cleavage by recurrent Cas9, the initial target, NHEJ mutation 1, NHEJ mutation 2, are boxed.

FIG. 8 shows the HDR frequencies with and without recurrent Cas9 cleavage for the Zea mays LIG1 target site. % HDR was increased over single cutting at the initial target (iTarget) when both the iTarget and the first most prevalent NHEJ mutation (NHEJ1) were targeted for cleavage. % HDR was also increased over single cutting at the iTarget when the iTarget and the first and second most prevalent NHEJ mutations (NHEJ1 and NHEJ2) were targeted for cleavage. Relative to experiments targeting only the iTarget (and without flanking DNA targets), the fold increase in HDR was nearly 20-fold.

FIG. 9 shows a recurrent cutting (RC) and template-directed repair HDR strategy for a template flanked by both homology regions and sequences sharing homology to the initial target (iTarget) site (FIG. 9A), for a template flanked by both homology regions and sequences sharing homology to the first most prevalent NHEJ mutation (NHEJ1) (FIG. 9B), and for a template flanked by both homology regions and sequences sharing no homology to either the iTarget or the first most prevalent mutation of the target site (TS), but instead sharing homology to a site with no homology to the iTarget (FIG. 9C).

FIG. 10 shows that flanking the DRT with a target site increased the frequency of HDR and frequencies were further elevated using RC Cas9. Notably, HDR outcomes were the highest when RC Cas9 and a DRT with partially homologous flanking target sites were used.

FIG. 11 shows the HDR frequencies with recurrent cutting (RC) and different flanking targets for the Zea mays MS26 target site.

FIG. 12 shows the HDR frequencies with recurrent cutting (RC) and different flanking targets for the Zea mays MS45 target site.

FIG. 13 shows the averages of HDR frequencies with RC Cas9 and different flanking targets across all target sites tested. On average, RC Cas9 with a repair template flanked by a site partially homologous to the iTarget provided a 28-fold increase in HDR frequencies relative to experiments when only the iTarget (without DRT flanking sequences) was cleaved. Also, as observed at the LIG1 target, flanking targets that contained homology with the iTarget (either complete or partial homology) enhanced HDR, with the highest frequencies being recovered for targets with partial homology (NHEJ1) to the iTarget. To determine statistical probabilities among treatments, a one-sided t-test was performed assuming 95% confidence. Those groups with a probability (p) value less than 0.05 were statistically different.

FIG. 14 shows a PCR-based method for detecting HDR (FIG. 14A) and an agarose gel depicting the frequency of transformed callus tissue positive for HDR, with results from the iTarget only and no sites flanking the DRT (FIG. 14B) and for the recurrent cutting Cas9 experiment with the NHEJ1 flanking the DRT (FIG. 14C) being shown. For the recurrent cutting experiment, the MS26 HDR edit was detectable in almost all of the transformed tissues as compared to only a few instances in the control.

FIG. 15 shows the percentage of regenerated plantlets with HDR from cleavage (iTarget only) or recurrent cleavage (with NHEJ1 flanking the DRT) at two different target sites in maize (LIG1 and MS26). RC not only significantly improved the HDR outcome in the germline, but also increased the frequency of bi-allelic HDR.

FIG. 16 shows the strategy for an RC approach for the insertion of larger DNA fragments (e.g. transgenes).

FIG. 17 shows HDR frequencies for single cleavage without flanking targets (iTarget only, No Sites Flanking the DRT) and with flanking targets (iTarget only, iTarget Flanking the DRT), as compared to RC with NHEJ1 flanking the DRT. When taking into account plants with both mono- and bi-allelic HDR, the improvement to HDR was 5.2- and 8.2-fold over the control for iTarget only (with iTargets flanking the DRT) and RC (with NHEJ1 targets flanking the DRT) treatments, respectively. Similar to that observed at the LIG1 and MS26 sites, the proportion of HDR edits for RC Cas9 that were determined to be bi-allelic was significantly enhanced compared to the other treatments (3.3% for RC Cas9, 0.7% for iTarget (with DRTs with iTarget sites), and 0.0% for iTarget (without sites flanking the DRT)). Moreover, the proportion of clean plants, defined as having no other DNA integrated (e.g. Cas9, sgRNA, BBM, Wus) except the NPTII expression cassette at the target site, was also elevated for RC Cas9.

The sequence descriptions and sequence listing attached hereto comply with the rules governing nucleotide and amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §§ 1.821 and 1.825. The sequence descriptions comprise the three letter codes for amino acids as defined in 37 C.F.R. §§ 1.821 and 1.825, which are incorporated herein by reference.

-   SEQ ID NO:1 is the Zea mays DNA sequence for the LIG1 sgRNA Target.
-   SEQ ID NO:2 is the Zea mays DNA sequence for the MS26 sgRNA Target.
-   SEQ ID NO:3 is the Zea mays DNA sequence for the MS45 sgRNA Target.
-   SEQ ID NO:4 is the Zea mays DNA sequence for the TS45 sgRNA Target.
-   SEQ ID NO:5 is the Zea mays DNA sequence for the LIG1 Target Region.
-   SEQ ID NO:6 is the Zea mays DNA sequence for the LIG1 NHEJ Mutation 1.
-   SEQ ID NO:7 is the Zea mays DNA sequence for the LIG1 NHEJ Mutation 2.
-   SEQ ID NO:8 is the Zea mays DNA sequence for the LIG1 NHEJ Mutation 3.
-   SEQ ID NO:9 is the Zea mays DNA sequence for the LIG1 NHEJ Mutation 4.
-   SEQ ID NO:10 is the Zea mays DNA sequence for the LIG1 NHEJ Mutation 5.
-   SEQ ID NO:11 is the Zea mays DNA sequence for the MS26 Target Region.
-   SEQ ID NO:12 is the Zea mays DNA sequence for the MS26 NHEJ Mutation 1.
-   SEQ ID NO:13 is the Zea mays DNA sequence for the MS26 NHEJ Mutation 2.
-   SEQ ID NO:14 is the Zea mays DNA sequence for the MS26 NHEJ Mutation 3.
-   SEQ ID NO:15 is the Zea mays DNA sequence for the MS26 NHEJ Mutation 4.
-   SEQ ID NO:16 is the Zea mays DNA sequence for the MS26 NHEJ Mutation 5.
-   SEQ ID NO:17 is the Zea mays DNA sequence for the MS45 Target Region.
-   SEQ ID NO:18 is the Zea mays DNA sequence for the MS45 NHEJ Mutation 1.
-   SEQ ID NO:19 is the Zea mays DNA sequence for the MS45 NHEJ Mutation 2.
-   SEQ ID NO:20 is the Zea mays DNA sequence for the MS45 NHEJ Mutation 3.
-   SEQ ID NO:21 is the Zea mays DNA sequence for the MS45 NHEJ Mutation 4.
-   SEQ ID NO:22 is the Zea mays DNA sequence for the MS45 NHEJ Mutation 5.
-   SEQ ID NO:23 is the Zea mays DNA sequence for the TS45 Target Region.
-   SEQ ID NO:24 is the Zea mays DNA sequence for the TS45 NHEJ Mutation 1.
-   SEQ ID NO:25 is the Zea mays DNA sequence for the TS45 NHEJ Mutation 2.
-   SEQ ID NO:26 is the Zea mays DNA sequence for the TS45 NHEJ Mutation 3.
-   SEQ ID NO:27 is the Zea mays DNA sequence for the TS45 NHEJ Mutation 4.
-   SEQ ID NO:28 is the Zea mays DNA sequence for the TS45 NHEJ Mutation 5.
-   SEQ ID NO:29 is the Zea mays DNA sequence for the LIG1 NHEJ 1 sgRNA Target.
-   SEQ ID NO:30 is the Zea mays DNA sequence for the LIG1 NHEJ 2 sgRNA Target.
-   SEQ ID NO:31 is the Zea mays DNA sequence for the MS26 NHEJ 1 sgRNA Target.
-   SEQ ID NO:32 is the Zea mays DNA sequence for the MS26 NHEJ 2 sgRNA Target.
-   SEQ ID NO:33 is the Zea mays DNA sequence for the MS45 NHEJ 1 sgRNA Target.
-   SEQ ID NO:34 is the Zea mays DNA sequence for the MS45 NHEJ 2 sgRNA Target.
-   SEQ ID NO:35 is the Zea mays DNA sequence for the TS45 NHEJ 1 sgRNA Target.
-   SEQ ID NO:36 is the Zea mays DNA sequence for the TS45 NHEJ 2 sgRNA Target.
-   SEQ ID NO:37 is the Artificial DNA sequence for the MS45-BLAT2-BN1m Sense.
-   SEQ ID NO:38 is the Artificial DNA sequence for the MS45-BLAT2-BN1m AntiSense repair template.
-   SEQ ID NO:39 is the Artificial DNA sequence for the M16-BN-S repair template.
-   SEQ ID NO:40 is the Artificial DNA sequence for the M16-BN-AS repair template.
-   SEQ ID NO:41 is the Artificial DNA sequence for the NLB-CR8-BN-S repair template.
-   SEQ ID NO:42 is the Artificial DNA sequence for the NLB-CR8-BN-AS repair template.
-   SEQ ID NO:43 is the Artificial DNA sequence for the MS45-Sa-BN-S repair template.
-   SEQ ID NO:44 is the Artificial DNA sequence for the MS45-Sa-BN-AS repair template.
-   SEQ ID NO:45 is the Artificial DNA sequence for the TS50-BN-S repair template.
-   SEQ ID NO:46 is the Artificial DNA sequence for the TS50-BN-AS repair template.
-   SEQ ID NO:47 is the Artificial DNA sequence for the TS45_Sense repair template.
-   SEQ ID NO:48 is the Artificial DNA sequence for the TS45_Antisense repair template.
-   SEQ ID NO:49 is the Artificial DNA sequence for the MS45-BLAT2-Spyn1 flanking Spy Guide for MS45 BLAT target.
-   SEQ ID NO:50 is the Artificial DNA sequence for the MS45-BLAT2-Spyn2 flanking Spy Guide for MS45 BLAT target.
-   SEQ ID NO:51 is the Artificial DNA sequence for the MS45-BLAT2-Spyn3 flanking Spy Guide for MS45 BLAT target.
-   SEQ ID NO:52 is the Artificial DNA sequence for the MS45-BLAT2-Spyn4 flanking Spy Guide for MS45 BLAT target.
-   SEQ ID NO:53 is the Artificial DNA sequence for the MS45-BLAT2-Spyn5 flanking Spy Guide for MS45 BLAT target.
-   SEQ ID NO:54 is the Artificial DNA sequence for the M16-BLAT-L flanking BLAT Guide for named Spy sites.
-   SEQ ID NO:55 is the Artificial DNA sequence for the M16-BLAT-R flanking BLAT Guide for named Spy sites.
-   SEQ ID NO:56 is the Artificial DNA sequence for the NLB18_8_BLAT-L flanking BLAT Guide for named Spy sites.
-   SEQ ID NO:57 is the Artificial DNA sequence for the NLB18_8_BLAT-R flanking BLAT Guide for named Spy sites.
-   SEQ ID NO:58 is the Artificial DNA sequence for the NLB_8_BLAT-L flanking BLAT Guide for named Spy sites.
-   SEQ ID NO:59 is the Artificial DNA sequence for the NLB_8_BLAT-R flanking BLAT Guide for named Spy sites.
-   SEQ ID NO:60 is the Artificial DNA sequence for the MS45-Spy-L flanking Spy Guide for named Sa sites.
-   SEQ ID NO:61 is the Artificial DNA sequence for the MS45-Spy-R flanking Spy Guide for named Sa sites.
-   SEQ ID NO:62 is the Artificial DNA sequence for the MS26-Spy-L flanking Spy Guide for named Sa sites.
-   SEQ ID NO:63 is the Artificial DNA sequence for the MS26-Spy-R flanking Spy Guide for named Sa sites.
-   SEQ ID NO:64 is the Artificial DNA sequence for the TS50-Sa-L flanking Sa Guide for named Spy sites.
-   SEQ ID NO:65 is the Artificial DNA sequence for the TS50-Sa-R flanking Sa Guide for named Spy sites.
-   SEQ ID NO:66 is the Artificial DNA sequence for the TS45-Sa-L flanking Sa Guide for named Spy sites.
-   SEQ ID NO:67 is the Artificial DNA sequence for the TS45-Sa-R flanking Sa Guide for named Spy sites.
-   SEQ ID NO:68 is the Zea mays DNA sequence for the M16 TS for Blat-Spy-Blat nick-break-nick strategy.
-   SEQ ID NO:69 is the Zea mays DNA sequence for the NLB-CR8 TS for Blat-Spy-Blat nick-break-nick strategy.
-   SEQ ID NO:70 is the Zea mays DNA sequence for the MS45 Sa TS for Spy-Sa-Spy nick-break-nick strategy.
-   SEQ ID NO:71 is the Zea mays DNA sequence for the TS50 TS for Sa-Spy-Sa nick-break-nick strategy.
-   SEQ ID NO:72 is the Zea mays DNA sequence for the TS45 TS for Sa-Spy-Sa nick-break-nick strategy.
-   SEQ ID NO:73 is the Zea mays DNA sequence for the MS-45 BLAT2 TS for Spy-Blat-Spy nick-break-nick strategy.
-   SEQ ID NO:74 is the Artificial DNA sequence for the MS45 BLAT2 sgRNA Guide.
-   SEQ ID NO:75 is the Artificial DNA sequence for the MS45 Sa sgRNA Guide.
-   SEQ ID NO:76 is the Artificial DNA sequence for the TS50 sgRNA Guide.
-   SEQ ID NO:77 is the Artificial DNA sequence for the M16 sgRNA Guide.
-   SEQ ID NO:78 is the Artificial DNA sequence for the NLB_8 sgRNA Guide.

DETAILED DESCRIPTION

Compositions and methods are provided for increasing the probability of homology-directed repair as the preferred outcome of double-strand-break repair of a polynucleotide.

Terms used in the claims and specification are defined as set forth below unless otherwise specified. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.

Definitions

As used herein, “nucleic acid” means a polynucleotide and includes a single or a double-stranded polymer of deoxyribonucleotide or ribonucleotide bases. Nucleic acids may also include fragments and modified nucleotides. Thus, the terms “polynucleotide”, “nucleic acid sequence”, “nucleotide sequence” and “nucleic acid fragment” are used interchangeably to denote a polymer of RNA and/or DNA and/or RNA-DNA that is single- or double-stranded, optionally comprising synthetic, non-natural, or altered nucleotide bases. Nucleotides (usually found in their 5′-monophosphate form) are referred to by their single letter designation as follows: “A” for adenosine or deoxyadenosine (for RNA or DNA, respectively), “C” for cytidine or deoxycytidine, “G” for guanosine or deoxyguanosine, “U” for uridine, “T” for deoxythymidine, “R” for purines (A or G), “Y” for pyrimidines (C or T), “K” for G or T, “H” for A or C or T, “I” for inosine, and “N” for any nucleotide.
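As an illustration of how the degenerate single-letter designations above may be read, the following minimal Python sketch tests whether a DNA sequence is consistent with a pattern written in those codes. The table name `DEGENERATE` and the function `matches` are hypothetical names introduced only for this illustration; the sketch covers only the DNA codes recited above and leaves out “U” and “I”.

```python
# Degenerate codes as recited in the definition above (DNA only).
# DEGENERATE and matches() are illustrative names, not part of the disclosure.
DEGENERATE = {
    "A": {"A"},
    "C": {"C"},
    "G": {"G"},
    "T": {"T"},
    "R": {"A", "G"},            # purines
    "Y": {"C", "T"},            # pyrimidines
    "K": {"G", "T"},
    "H": {"A", "C", "T"},
    "N": {"A", "C", "G", "T"},  # any nucleotide
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if every base of `sequence` is allowed by the code
    at the same position of `pattern` (equal lengths required)."""
    if len(pattern) != len(sequence):
        return False
    return all(
        base in DEGENERATE[code]
        for code, base in zip(pattern.upper(), sequence.upper())
    )


if __name__ == "__main__":
    print(matches("NGG", "AGG"))  # N accepts any base at position 1
    print(matches("RY", "AG"))    # G is not a pyrimidine
```

A pattern containing a letter outside the table raises a `KeyError`, which in this sketch is the intended signal that the code is not one of those listed.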

The term “genome” as it applies to prokaryotic and eukaryotic cells or organisms encompasses not only chromosomal DNA found within the nucleus, but also organelle DNA found within subcellular components (e.g., mitochondria, or plastid) of the cell.

“Open reading frame” is abbreviated ORF.

The term “selectively hybridizes” includes reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences typically have at least about 80% sequence identity, or 90% sequence identity, up to and including 100% sequence identity (i.e., fully complementary) with each other.

The term “stringent conditions” or “stringent hybridization conditions” includes reference to conditions under which a probe will selectively hybridize to its target sequence in an in vitro hybridization assay. Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences can be identified which are 100% complementary to the probe (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optionally less than 500 nucleotides in length. Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salt(s)) at pH 7.0 to 8.3, and at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37° C., and a wash in 1× to 2×SSC (20×SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C.
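The wash strengths recited above are dilutions of the stated 20×SSC stock (3.0 M NaCl/0.3 M trisodium citrate). The following Python sketch, offered only as an arithmetic illustration, converts an SSC strength into the corresponding NaCl and trisodium citrate molarities by simple proportional scaling. The function name `ssc_molarity` and the constants are hypothetical names introduced for this example.

```python
# Stock composition as stated in the text: 20x SSC = 3.0 M NaCl / 0.3 M trisodium citrate.
# ssc_molarity() and the constant names are illustrative only.
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(fold: float) -> tuple[float, float]:
    """Return (NaCl molarity, trisodium citrate molarity) for an SSC
    solution of the given strength, e.g. fold=0.1 for 0.1x SSC."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    scale = fold / STOCK_FOLD  # linear dilution from the 20x stock
    return STOCK_NACL_M * scale, STOCK_CITRATE_M * scale


if __name__ == "__main__":
    # Wash strengths named in the exemplary low, moderate and high stringency conditions.
    for fold in (2.0, 1.0, 0.5, 0.1):
        nacl, citrate = ssc_molarity(fold)
        print(f"{fold}x SSC: {nacl:.4f} M NaCl, {citrate:.5f} M trisodium citrate")
```

Under this scaling, the 0.1×SSC high stringency wash corresponds to roughly 0.015 M NaCl, compared with roughly 0.3 M NaCl for the 2×SSC low stringency wash.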

By “homology” is meant DNA sequences that are similar. For example, a “region of homology to a genomic region” that is found on the donor DNA is a region of DNA that has a similar sequence to a given “genomic region” in the cell or organism genome. A region of homology can be of any length that is sufficient to promote homologous recombination at the cleaved target site. For example, the region of homology can comprise at least 5-10, 5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, 5-60, 5-65, 5-70, 5-75, 5-80, 5-85, 5-90, 5-95, 5-100, 5-200, 5-300, 5-400, 5-500, 5-600, 5-700, 5-800, 5-900, 5-1000, 5-1100, 5-1200, 5-1300, 5-1400, 5-1500, 5-1600, 5-1700, 5-1800, 5-1900, 5-2000, 5-2100, 5-2200, 5-2300, 5-2400, 5-2500, 5-2600, 5-2700, 5-2800, 5-2900, 5-3000, 5-3100 or more bases in length such that the region of homology has sufficient homology to undergo homologous recombination with the corresponding genomic region. “Sufficient homology” indicates that two polynucleotide sequences have sufficient structural similarity to act as substrates for a homologous recombination reaction. The structural similarity includes overall length of each polynucleotide fragment, as well as the sequence similarity of the polynucleotides. Sequence similarity can be described by the percent sequence identity over the whole length of the sequences, and/or by conserved regions comprising localized similarities such as contiguous nucleotides having 100% sequence identity, and percent sequence identity over a portion of the length of the sequences.

As used herein, a “genomic region” is a segment of a chromosome in the genome of a cell that is present on either side of the target site or, alternatively, also comprises a portion of the target site. The genomic region can comprise at least 5-10, 5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, 5-60, 5-65, 5-70, 5-75, 5-80, 5-85, 5-90, 5-95, 5-100, 5-200, 5-300, 5-400, 5-500, 5-600, 5-700, 5-800, 5-900, 5-1000, 5-1100, 5-1200, 5-1300, 5-1400, 5-1500, 5-1600, 5-1700, 5-1800, 5-1900, 5-2000, 5-2100, 5-2200, 5-2300, 5-2400, 5-2500, 5-2600, 5-2700, 5-2800, 5-2900, 5-3000, 5-3100 or more bases such that the genomic region has sufficient homology to undergo homologous recombination with the corresponding region of homology.

As used herein, “homologous recombination” (HR) includes the exchange of DNA fragments between two DNA molecules at the sites of homology. The frequency of homologous recombination is influenced by a number of factors. Different organisms vary with respect to the amount of homologous recombination and the relative proportion of homologous to non-homologous recombination. Generally, the length of the region of homology affects the frequency of homologous recombination events: the longer the region of homology, the greater the frequency. The length of the homology region needed to observe homologous recombination is also species-variable. In many cases, at least 5 kb of homology has been utilized, but homologous recombination has been observed with as little as 25-50 bp of homology. See, for example, Singer et al., (1982) Cell 31:25-33; Shen and Huang, (1986) Genetics 112:441-57; Watt et al., (1985) Proc. Natl. Acad. Sci. USA 82:4768-72; Sugawara and Haber, (1992) Mol Cell Biol 12:563-75; Rubnitz and Subramani, (1984) Mol Cell Biol 4:2253-8; Ayares et al., (1986) Proc. Natl. Acad. Sci. USA 83:5199-203; Liskay et al., (1987) Genetics 115:161-7.

“Sequence identity” or “identity” in the context of nucleic acid or polypeptide sequences refers to the nucleic acid bases or amino acid residues in two sequences that are the same when aligned for maximum correspondence over a specified comparison window.

The term “percentage of sequence identity” refers to the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity. Useful examples of percent sequence identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any percentage from 50% to 100%. These identities can be determined using any of the programs described herein.
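The calculation described above can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned and that gaps are written as "-"; the function name percent_identity and the example window are hypothetical and are not drawn from any program referenced herein.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences over a comparison window.

    Matched positions are divided by the total number of positions in the
    window (gap positions included) and the result is multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position counts as matched only if both residues are identical
    # and neither is a gap character.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Ten-position toy window: eight matches, one mismatch, one gap.
print(percent_identity("ACGTTGCA-T", "ACGATGCAGT"))  # 80.0
```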

Sequence alignments and percent identity or similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the MegAlign™ program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Within the context of this application it will be understood that where sequence analysis software is used for analysis, the results of the analysis will be based on the “default values” of the program referenced, unless otherwise specified. As used herein “default values” will mean any set of values or parameters that originally load with the software when first initialized.

The “Clustal V method of alignment” corresponds to the alignment method labeled Clustal V (described by Higgins and Sharp, (1989) CABIOS 5:151-153; Higgins et al., (1992) Comput Appl Biosci 8:189-191) and found in the MegAlign™ program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). For multiple alignments, the default values correspond to GAP PENALTY=10 and GAP LENGTH PENALTY=10. Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After alignment of the sequences using the Clustal V program, it is possible to obtain a “percent identity” by viewing the “sequence distances” Table in the same program.

The “Clustal W method of alignment” corresponds to the alignment method labeled Clustal W (described by Higgins and Sharp, (1989) CABIOS 5:151-153; Higgins et al., (1992) Comput Appl Biosci 8:189-191) and found in the MegAlign™ v6.1 program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Default parameters for multiple alignment are GAP PENALTY=10, GAP LENGTH PENALTY=0.2, Delay Divergent Seqs (%)=30, DNA Transition Weight=0.5, Protein Weight Matrix=Gonnet Series, and DNA Weight Matrix=IUB. After alignment of the sequences using the Clustal W program, it is possible to obtain a “percent identity” by viewing the “sequence distances” Table in the same program.

Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained using GAP Version 10 (GCG, Accelrys, San Diego, Calif.) using the following parameters: % identity and % similarity for a nucleotide sequence using a gap creation penalty weight of 50 and a gap length extension penalty weight of 3, and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using a GAP creation penalty weight of 8 and a gap length extension penalty of 2, and the BLOSUM62 scoring matrix (Henikoff and Henikoff, (1992) Proc. Natl. Acad. Sci. USA 89:10915). GAP uses the algorithm of Needleman and Wunsch, (1970) J Mol Biol 48:443-53, to find an alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps. GAP considers all possible alignments and gap positions and creates the alignment with the largest number of matched bases and the fewest gaps, using a gap creation penalty and a gap extension penalty in units of matched bases.

“BLAST” is a searching algorithm provided by the National Center for Biotechnology Information (NCBI) used to find regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches to identify sequences having sufficient similarity to a query sequence such that the similarity would not be predicted to have occurred randomly. BLAST reports the identified sequences and their local alignment to the query sequence.

It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides from other species or modified naturally or synthetically wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95%, or any percentage from 50% to 100%. Indeed, any amino acid identity from 50% to 100% may be useful in describing the present disclosure, such as 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.
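For illustration only, a global alignment in the style of Needleman and Wunsch can be sketched in Python as follows. This sketch is a simplification and is not the GAP program: it uses a single linear gap penalty rather than separate gap creation and gap extension penalties, and a flat match/mismatch score rather than a scoring matrix such as nwsgapdna.cmp or BLOSUM62. The function name and the scoring values are assumptions chosen for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> tuple[int, str, str]:
    """Global alignment of two complete sequences with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not the defaults of any referenced program.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # (1, 'ACGT', 'A-GT')
```

Percent identity could then be computed over the two returned aligned strings.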

Polynucleotide and polypeptide sequences, variants thereof, and the structural relationships of these sequences can be described by the terms “homology”, “homologous”, “substantially identical”, “substantially similar” and “corresponding substantially” which are used interchangeably herein. These refer to polypeptide or nucleic acid sequences wherein changes in one or more amino acids or nucleotide bases do not affect the function of the molecule, such as the ability to mediate gene expression or to produce a certain phenotype. These terms also refer to modification(s) of nucleic acid sequences that do not substantially alter the functional properties of the resulting nucleic acid relative to the initial, unmodified nucleic acid. These modifications include deletion, substitution, and/or insertion of one or more nucleotides in the nucleic acid fragment. Substantially similar nucleic acid sequences encompassed may be defined by their ability to hybridize (under moderately stringent conditions, e.g., 0.5×SSC, 0.1% SDS, 60° C.) with the sequences exemplified herein, or to any portion of the nucleotide sequences disclosed herein and which are functionally equivalent to any of the nucleic acid sequences disclosed herein. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions.

A “centimorgan” (cM) or “map unit” is the distance between two polynucleotide sequences, linked genes, markers, target sites, loci, or any pair thereof, wherein 1% of the products of meiosis are recombinant. Thus, a centimorgan is equivalent to a distance equal to a 1% average recombination frequency between the two linked genes, markers, target sites, loci, or any pair thereof.
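As a simple worked illustration of this definition, the map distance implied by an observed recombination frequency can be computed as sketched below in Python. The function name map_distance_cm and the counts are hypothetical, and the sketch applies the 1%-per-centimorgan equivalence directly, without any correction for multiple crossovers.

```python
def map_distance_cm(recombinant: int, total: int) -> float:
    """Map distance in centimorgans from counts of meiotic products.

    Applies the definition directly: 1% recombinant products = 1 cM.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinant <= total:
        raise ValueError("recombinant must be between 0 and total")
    return 100.0 * recombinant / total


# Hypothetical counts: 12 recombinant products among 400 scored.
print(map_distance_cm(12, 400))  # 3.0
```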

An “isolated” or “purified” nucleic acid molecule, polynucleotide, polypeptide, or protein, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the polynucleotide or protein as found in its naturally occurring environment. Thus, an isolated or purified polynucleotide or polypeptide or protein is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Optimally, an “isolated” polynucleotide is free of sequences (optimally protein encoding sequences) that naturally flank the polynucleotide (i.e., sequences located at the 5′ and 3′ ends of the polynucleotide) in the genomic DNA of the organism from which the polynucleotide is derived. For example, in various embodiments, the isolated polynucleotide can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequence that naturally flanks the polynucleotide in genomic DNA of the cell from which the polynucleotide is derived. Isolated polynucleotides may be purified from a cell in which they naturally occur. Conventional nucleic acid purification methods known to skilled artisans may be used to obtain isolated polynucleotides. The term also embraces recombinant polynucleotides and chemically synthesized polynucleotides.

The term “fragment” refers to a contiguous set of nucleotides or amino acids. In one embodiment, a fragment is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or greater than 20 contiguous nucleotides. In one embodiment, a fragment is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or greater than 20 contiguous amino acids. A fragment may or may not exhibit the function of a sequence sharing some percent identity over the length of said fragment.

The terms “fragment that is functionally equivalent” and “functionally equivalent fragment” are used interchangeably herein. These terms refer to a portion or subsequence of an isolated nucleic acid fragment or polypeptide that displays the same activity or function as the longer sequence from which it derives. In one example, the fragment retains the ability to alter gene expression or produce a certain phenotype whether or not the fragment encodes an active protein. For example, the fragment can be used in the design of genes to produce the desired phenotype in a modified plant. Genes can be designed for use in suppression by linking a nucleic acid fragment, whether or not it encodes an active enzyme, in the sense or antisense orientation relative to a plant promoter sequence.

“Gene” includes a nucleic acid fragment that expresses a functional molecule such as, but not limited to, a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in its natural endogenous location with its own regulatory sequences.

By the term “endogenous” it is meant a sequence or other molecule that naturally occurs in a cell or organism. In one aspect, an endogenous polynucleotide is normally found in the genome of a cell; that is, not heterologous.

An “allele” is one of several alternative forms of a gene occupying a given locus on a chromosome. When all the alleles present at a given locus on a chromosome are the same, that plant is homozygous at that locus. If the alleles present at a given locus on a chromosome differ, that plant is heterozygous at that locus.

“Coding sequence” refers to a polynucleotide sequence which codes for a specific amino acid sequence. “Regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences include, but are not limited to, promoters, translation leader sequences, 5′ untranslated sequences, 3′ untranslated sequences, introns, polyadenylation target sequences, RNA processing sites, effector binding sites, and stem-loop structures.

A “mutated gene” is a gene that has been altered through human intervention. Such a “mutated gene” has a sequence that differs from the sequence of the corresponding non-mutated gene by at least one nucleotide addition, deletion, or substitution. In certain embodiments of the disclosure, the mutated gene comprises an alteration that results from a guide polynucleotide/Cas endonuclease system as disclosed herein. A mutated plant is a plant comprising a mutated gene.

As used herein, a “targeted mutation” is a mutation in a gene (referred to as the target gene), including a native gene, that was made by altering a target sequence within the target gene using any method known to one skilled in the art, including a method involving a guided Cas endonuclease system as disclosed herein.

The terms “knock-out”, “gene knock-out” and “genetic knock-out” are used interchangeably herein. A knock-out represents a DNA sequence of a cell that has been rendered partially or completely inoperative by targeting with a Cas protein; for example, a DNA sequence prior to knock-out could have encoded an amino acid sequence, or could have had a regulatory function (e.g., promoter).

The terms “knock-in”, “gene knock-in”, “gene insertion” and “genetic knock-in” are used interchangeably herein. A knock-in represents the replacement or insertion of a DNA sequence at a specific DNA sequence in a cell by targeting with a Cas protein (for example by homologous recombination (HR), wherein a suitable donor DNA polynucleotide is also used). Examples of knock-ins are a specific insertion of a heterologous amino acid coding sequence in a coding region of a gene, or a specific insertion of a transcriptional regulatory element in a genetic locus.

By “domain” it is meant a contiguous stretch of nucleotides (that can be RNA, DNA, and/or RNA-DNA-combination sequence) or amino acids.

The term “conserved domain” or “motif” means a set of polynucleotides or amino acids conserved at specific positions along an aligned sequence of evolutionarily related proteins. While amino acids at other positions can vary between homologous proteins, amino acids that are highly conserved at specific positions indicate amino acids that are essential to the structure, the stability, or the activity of a protein. Because they are identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers, or “signatures”, to determine if a protein with a newly determined sequence belongs to a previously identified protein family.

A “codon-modified gene” or “codon-preferred gene” or “codon-optimized gene” is a gene having its frequency of codon usage designed to mimic the frequency of preferred codon usage of the host cell.

An “optimized” polynucleotide is a sequence that has been optimized for improved expression in a particular heterologous host cell.

A “plant-optimized nucleotide sequence” is a nucleotide sequence that has been optimized for expression in plants, particularly for increased expression in plants. A plant-optimized nucleotide sequence includes a codon-optimized gene. A plant-optimized nucleotide sequence can be synthesized by modifying a nucleotide sequence encoding a protein such as, for example, a Cas endonuclease as disclosed herein, using one or more plant-preferred codons for improved expression. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage.
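The idea of substituting host-preferred codons without changing the encoded protein can be sketched in Python as follows. The standard genetic code table is built in the conventional TCAG order. The PREFERRED mapping is purely hypothetical and is not a measured codon-usage table for any plant or other host; the function names and the toy coding sequence are likewise assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# HYPOTHETICAL preferred codons for a few amino acids (illustration only).
PREFERRED = {"L": "CTG", "K": "AAG", "A": "GCC", "G": "GGC"}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        out.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(out)


original = "ATGTTAAAAGCTGGTTAA"          # toy sequence encoding M L K A G stop
optimized = recode(original, PREFERRED)
print(optimized)                          # ATGCTGAAGGCCGGCTAA
print(translate(original) == translate(optimized))  # True
```

In practice the preferred-codon table would be derived from codon usage data for the intended host; this sketch shows only the substitution step.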

A “promoter” is a region of DNA involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. An “enhancer” is a DNA sequence that can stimulate promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, and/or comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.

Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. The term “inducible promoter” refers to a promoter that selectively expresses a coding sequence or functional RNA in response to the presence of an endogenous or exogenous stimulus, for example by chemical compounds (chemical inducers) or in response to environmental, hormonal, chemical, and/or developmental signals. Inducible or regulated promoters include, for example, promoters induced or regulated by light, heat, stress, flooding or drought, salt stress, osmotic stress, phytohormones, wounding, or chemicals such as ethanol, abscisic acid (ABA), jasmonate, salicylic acid, or safeners.

“Translation leader sequence” refers to a polynucleotide sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (e.g., Turner and Foster, (1995) Mol Biotechnol 3:225-236).

“3′ non-coding sequences”, “transcription terminator” or “termination sequences” refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor. The use of different 3′ non-coding sequences is exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.

“RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript or pre-mRNA. An RNA transcript is referred to as the mature RNA or mRNA when it is an RNA sequence derived from post-transcriptional processing of the primary transcript pre-mRNA. “Messenger RNA” or “mRNA” refers to the RNA that is without introns and that can be translated into protein by the cell. “cDNA” refers to a DNA that is complementary to, and synthesized from, an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into double-stranded form using the Klenow fragment of DNA polymerase I. “Sense” RNA refers to RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro. “Antisense RNA” refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA, and that blocks the expression of a target gene (see, e.g., U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, introns, or the coding sequence. “Functional RNA” refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes. The terms “complement” and “reverse complement” are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
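The relationship between an mRNA and its antisense RNA, as the terms “complement” and “reverse complement” are used above, can be illustrated with a short Python sketch. The function name antisense_rna and the toy transcript are assumptions for this example, and only the four unmodified RNA bases are handled.

```python
# Watson-Crick pairing for the four unmodified RNA bases.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement of an mRNA, written 5' to 3'."""
    mrna = mrna.upper()
    unexpected = set(mrna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return mrna.translate(_RNA_COMPLEMENT)[::-1]


print(antisense_rna("AUGGCC"))  # GGCCAU
```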

The term “genome” refers to the entire complement of genetic material (genes and non-coding sequences) that is present in each cell of an organism, or virus or organelle; and/or a complete set of chromosomes inherited as a (haploid) unit from one parent.

The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is regulated by the other. For example, a promoter is operably linked with a coding sequence when it is capable of regulating the expression of that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions can be operably linked, either directly or indirectly, 5′ to the target mRNA, or 3′ to the target mRNA, or within the target mRNA, or a first complementary region is 5′ and its complement is 3′ to the target mRNA.

Generally, “host” refers to an organism or cell into which a heterologous component (polynucleotide, polypeptide, other molecule, cell) has been introduced. As used herein, a “host cell” refers to an in vivo or in vitro eukaryotic cell, prokaryotic cell (e.g., bacterial or archaeal cell), or cell from a multicellular organism (e.g., a cell line) cultured as a unicellular entity, into which a heterologous polynucleotide or polypeptide has been introduced. In some embodiments, the cell is selected from the group consisting of: an archaeal cell, a bacterial cell, a eukaryotic cell, a eukaryotic single-cell organism, a somatic cell, a germ cell, a stem cell, a plant cell, an algal cell, an animal cell, an invertebrate cell, a vertebrate cell, a fish cell, a frog cell, a bird cell, an insect cell, a mammalian cell, a pig cell, a cow cell, a goat cell, a sheep cell, a rodent cell, a rat cell, a mouse cell, a non-human primate cell, and a human cell. In some cases, the cell is in vitro. In some cases, the cell is in vivo.

The term “recombinant” refers to an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis, or manipulation of isolated segments of nucleic acids by genetic engineering techniques.

The terms “plasmid”, “vector” and “cassette” refer to a linear or circular extrachromosomal element often carrying genes that are not part of the central metabolism of the cell, and usually in the form of double-stranded DNA. Such elements may be autonomously replicating sequences, genome integrating sequences, phage, or nucleotide sequences, in linear or circular form, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a polynucleotide of interest into a cell. “Transformation cassette” refers to a specific vector comprising a gene and having elements in addition to the gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector comprising a gene and having elements in addition to the gene that allow for expression of that gene in a host.

The terms “recombinant DNA molecule”, “recombinant DNA construct”, “expression construct”, “construct”, and “recombinant construct” are used interchangeably herein. A recombinant DNA construct comprises an artificial combination of nucleic acid sequences, e.g., regulatory and coding sequences that are not all found together in nature. For example, a recombinant DNA construct may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. Such a construct may be used by itself or may be used in conjunction with a vector. If a vector is used, then the choice of vector is dependent upon the method that will be used to introduce the vector into the host cells as is well known to those skilled in the art. For example, a plasmid vector can be used. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells. The skilled artisan will also recognize that different independent transformation events may result in different levels and patterns of expression (Jones et al., (1985) EMBO J 4:2411-2418; De Almeida et al., (1989) Mol Gen Genetics 218:78-86), and thus that multiple events are typically screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by standard molecular biological, biochemical, and other assays including Southern analysis of DNA, Northern analysis of mRNA expression, PCR, real time quantitative PCR (qPCR), reverse transcription PCR (RT-PCR), immunoblotting analysis of protein expression, enzyme or activity assays, and/or phenotypic analysis.

The term “heterologous” refers to the difference between the original environment, location, or composition of a particular polynucleotide or polypeptide sequence and its current environment, location, or composition. Non-limiting examples include differences in taxonomic derivation (e.g., a polynucleotide sequence obtained from Zea mays would be heterologous if inserted into the genome of an Oryza sativa plant, or of a different variety or cultivar of Zea mays; or a polynucleotide obtained from a bacterium was introduced into a cell of a plant), or sequence (e.g., a polynucleotide sequence obtained from Zea mays, isolated, modified, and re-introduced into a maize plant). As used herein, “heterologous” in reference to a sequence can refer to a sequence that originates from a different species, variety, foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide. Alternatively, one or more regulatory region(s) and/or a polynucleotide provided herein may be entirely synthetic.

The term “expression”, as used herein, refers to the production of a functional end-product (e.g., an mRNA, guide RNA, or a protein) in either precursor or mature form.

A “mature” protein refers to a post-translationally processed polypeptide (i.e., one from which any pre- or propeptides present in the primary translation product have been removed).

“Precursor” protein refers to the primary product of translation of mRNA (i.e., with pre- and propeptides still present). Pre- and propeptides may be, but are not limited to, intracellular localization signals.

“CRISPR” (Clustered Regularly Interspaced Short Palindromic Repeats) loci refer to certain genetic loci encoding components of DNA cleavage systems, for example, used by bacterial and archaeal cells to destroy foreign DNA (Horvath and Barrangou, 2010, Science 327:167-170; WO2007025097, published 1 Mar. 2007). A CRISPR locus can consist of a CRISPR array, comprising short direct repeats (CRISPR repeats) separated by short variable DNA sequences (called spacers), which can be flanked by diverse Cas (CRISPR-associated) genes.
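The repeat-spacer organization of a CRISPR array described above can be illustrated with a brief Python sketch. The repeat and spacer sequences are toy values invented for this example and are not taken from any organism or from the sequence listing; the function name extract_spacers is likewise hypothetical, and the sketch assumes that every repeat copy in the array is identical.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Return the variable sequences found between identical direct repeats."""
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    # Splitting on the repeat leaves the spacers; empty strings arise where
    # the array begins or ends with a repeat and are discarded.
    return [segment for segment in array.split(repeat) if segment]


# Toy repeat and spacers, for illustration only.
REPEAT = "GTTTTAGAGCTA"
toy_array = REPEAT + "AACCGGTTAACC" + REPEAT + "TTGGCCAATTGG" + REPEAT
print(extract_spacers(toy_array, REPEAT))
```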

As used herein, an “effector” or “effector protein” is a protein that encompasses an activity including recognizing, binding to, and/or cleaving or nicking a polynucleotide target. An effector, or effector protein, may also be an endonuclease. The “effector complex” of a CRISPR system includes Cas proteins involved in crRNA and target recognition and binding. Some of the component Cas proteins may additionally comprise domains involved in target polynucleotide cleavage.

The term “Cas protein” refers to a polypeptide encoded by a Cas (CRISPR-associated) gene. A Cas protein includes but is not limited to: a Cas9 protein, a Cpf1 (Cas12) protein, a C2c1 protein, a C2c2 protein, a C2c3 protein, Cas3, Cas3-HD, Cas5, Cas7, Cas8, Cas10, or combinations or complexes of these. A Cas protein may be a “Cas endonuclease” or “Cas effector protein”, that when in complex with a suitable polynucleotide component, is capable of recognizing, binding to, and optionally nicking or cleaving all or part of a specific polynucleotide target sequence. A Cas endonuclease described herein comprises one or more nuclease domains. The endonucleases of the disclosure may include those having one or more RuvC nuclease domains. A Cas protein is further defined as a functional fragment or functional variant of a native Cas protein, or a protein that shares at least 50%, between 50% and 55%, at least 55%, between 55% and 60%, at least 60%, between 60% and 65%, at least 65%, between 65% and 70%, at least 70%, between 70% and 75%, at least 75%, between 75% and 80%, at least 80%, between 80% and 85%, at least 85%, between 85% and 90%, at least 90%, between 90% and 95%, at least 95%, between 95% and 96%, at least 96%, between 96% and 97%, at least 97%, between 97% and 98%, at least 98%, between 98% and 99%, at least 99%, between 99% and 100%, or 100% sequence identity with at least 50, between 50 and 100, at least 100, between 100 and 150, at least 150, between 150 and 200, at least 200, between 200 and 250, at least 250, between 250 and 300, at least 300, between 300 and 350, at least 350, between 350 and 400, at least 400, between 400 and 450, at least 500, or greater than 500 contiguous amino acids of a native Cas protein, and retains at least partial activity.

A “Cas endonuclease” may comprise domains that enable it to function as a double-strand-break-inducing agent. A “Cas endonuclease” may also comprise one or more modifications or mutations that abolish or reduce its ability to cleave a double-strand polynucleotide (dCas). In some aspects, the Cas endonuclease molecule may retain the ability to nick a single-strand polynucleotide (for example, a D10A mutation in a Cas9 endonuclease molecule) (nCas9).

The terms “functional fragment”, “fragment that is functionally equivalent” and “functionally equivalent fragment” of a Cas endonuclease are used interchangeably herein, and refer to a portion or subsequence of the Cas endonuclease of the present disclosure in which the ability to recognize, bind to, and optionally unwind, nick or cleave (introduce a single or double-strand break in) the target site is retained. The portion or subsequence of the Cas endonuclease can comprise a complete or partial (functional) peptide of any one of its domains such as, for example, but not limited to, a complete or functional part of a Cas3 HD domain, a complete or functional part of a Cas3 Helicase domain, or a complete or functional part of a Cascade protein (such as, but not limited to, a Cas5, Cas5d, Cas7 and Cas8b1).

The terms “functional variant”, “variant that is functionally equivalent” and “functionally equivalent variant” of a Cas endonuclease or Cas effector protein are used interchangeably herein, and refer to a variant of the Cas effector protein disclosed herein in which the ability to recognize, bind to, and optionally unwind, nick or cleave all or part of a target sequence is retained.

A Cas endonuclease may also include a multifunctional Cas endonuclease. The terms “multifunctional Cas endonuclease” and “multifunctional Cas endonuclease polypeptide” are used interchangeably herein and include reference to a single polypeptide that has Cas endonuclease functionality (comprising at least one protein domain that can act as a Cas endonuclease) and at least one other functionality, such as but not limited to, the functionality to form a cascade (comprises at least a second protein domain that can form a cascade with other proteins). In one aspect, the multifunctional Cas endonuclease comprises at least one additional protein domain relative (either internally, upstream (5′), downstream (3′), or both internally 5′ and 3′, or any combination thereof) to those domains typical of a Cas endonuclease.

The terms “cascade” and “cascade complex” are used interchangeably herein and include reference to a multi-subunit protein complex that can assemble with a polynucleotide forming a polynucleotide-protein complex (PNP). Cascade is a PNP that relies on the polynucleotide for complex assembly and stability, and for the identification of target nucleic acid sequences. Cascade functions as a surveillance complex that finds and optionally binds target nucleic acids that are complementary to a variable targeting domain of the guide polynucleotide.

The terms “cleavage-ready Cascade”, “crCascade”, “cleavage-ready Cascade complex”, “crCascade complex”, “cleavage-ready Cascade system”, “CRC” and “crCascade system”, are used interchangeably herein and include reference to a multi-subunit protein complex that can assemble with a polynucleotide forming a polynucleotide-protein complex (PNP), wherein one of the cascade proteins is a Cas endonuclease capable of recognizing, binding to, and optionally unwinding, nicking, or cleaving all or part of a target sequence.

The terms “5′-cap” and “7-methylguanylate (m7G) cap” are used interchangeably herein. A 7-methylguanylate residue is located on the 5′ terminus of messenger RNA (mRNA) in eukaryotes. RNA polymerase II (Pol II) transcribes mRNA in eukaryotes. Messenger RNA capping occurs generally as follows: The most terminal 5′ phosphate group of the mRNA transcript is removed by RNA terminal phosphatase, leaving two terminal phosphates. A guanosine monophosphate (GMP) is added to the terminal phosphate of the transcript by a guanylyl transferase, leaving a 5′-5′ triphosphate-linked guanine at the transcript terminus. Finally, the 7-nitrogen of this terminal guanine is methylated by a methyl transferase.

The terminology “not having a 5′-cap” herein is used to refer to RNA having, for example, a 5′-hydroxyl group instead of a 5′-cap. Such RNA can be referred to as “uncapped RNA”, for example. Uncapped RNA can better accumulate in the nucleus following transcription, since 5′-capped RNA is subject to nuclear export. One or more RNA components herein are uncapped.

As used herein, the term “guide polynucleotide” relates to a polynucleotide sequence that can form a complex with a Cas endonuclease, including the Cas endonuclease described herein, and enables the Cas endonuclease to recognize, optionally bind to, and optionally cleave a DNA target site. The guide polynucleotide sequence can be an RNA sequence, a DNA sequence, or a combination thereof (an RNA-DNA combination sequence).

The terms “functional fragment”, “fragment that is functionally equivalent” and “functionally equivalent fragment” of a guide RNA, crRNA or tracrRNA are used interchangeably herein, and refer to a portion or subsequence of the guide RNA, crRNA or tracrRNA, respectively, of the present disclosure in which the ability to function as a guide RNA, crRNA or tracrRNA, respectively, is retained.

The terms “functional variant”, “variant that is functionally equivalent” and “functionally equivalent variant” of a guide RNA, crRNA or tracrRNA (respectively) are used interchangeably herein, and refer to a variant of the guide RNA, crRNA or tracrRNA, respectively, of the present disclosure in which the ability to function as a guide RNA, crRNA or tracrRNA, respectively, is retained.

The terms “single guide RNA” and “sgRNA” are used interchangeably herein and relate to a synthetic fusion of two RNA molecules, a crRNA (CRISPR RNA) comprising a variable targeting domain (linked to a tracr mate sequence that hybridizes to a tracrRNA), fused to a tracrRNA (trans-activating CRISPR RNA). The single guide RNA can comprise a crRNA or crRNA fragment and a tracrRNA or tracrRNA fragment of the type II CRISPR/Cas system that can form a complex with a type II Cas endonuclease, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, optionally bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site.

The terms “variable targeting domain” and “VT domain” are used interchangeably herein and include a nucleotide sequence that can hybridize (is complementary) to one strand (nucleotide sequence) of a double strand DNA target site. The percent complementation between the first nucleotide sequence domain (VT domain) and the target sequence can be at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. The variable targeting domain can be at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length. In some embodiments, the variable targeting domain comprises a contiguous stretch of 12 to 30 nucleotides. The variable targeting domain can be composed of a DNA sequence, an RNA sequence, a modified DNA sequence, a modified RNA sequence, or any combination thereof.
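As an illustration of the percent complementation and length limits described above, the following minimal Python sketch scores a VT domain against one strand of a target site. The function names, the ungapped position-by-position comparison, and the example sequences are assumptions made for illustration only and are not part of the disclosure.

```python
# Watson-Crick partner expected in a DNA target strand for each guide base.
# "U" is included so that an RNA VT domain can be scored as well.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def percent_complementarity(vt_domain: str, target_strand: str) -> float:
    """Return the percentage of VT-domain positions that pair with the target.

    Both sequences are written 5'->3'. The target strand is read in reverse
    so that each VT-domain base is compared with its antiparallel partner.
    This is an ungapped comparison (an illustrative simplification).
    """
    vt = vt_domain.upper()
    target = target_strand.upper()
    if not vt or len(vt) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1 for g, t in zip(vt, reversed(target)) if COMPLEMENT.get(g) == t
    )
    return 100.0 * paired / len(vt)


def has_valid_vt_length(vt_domain: str, min_len: int = 12, max_len: int = 30) -> bool:
    """Check the 12 to 30 nucleotide range given in the definition above."""
    return min_len <= len(vt_domain) <= max_len


if __name__ == "__main__":
    vt = "ACGTACGTACGTACGTACGT"         # hypothetical 20-nt VT domain
    perfect = "ACGTACGTACGTACGTACGT"     # its exact reverse complement
    one_mismatch = "ACGTACGTACGTACGTACGA"
    print(percent_complementarity(vt, perfect))
    print(percent_complementarity(vt, one_mismatch))
    print(has_valid_vt_length(vt))
```

A threshold from the list above (for example, at least 90%) can then be applied to the returned value.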

The terms “Cas endonuclease recognition domain” and “CER domain” (of a guide polynucleotide) are used interchangeably herein and include a nucleotide sequence that interacts with a Cas endonuclease polypeptide. A CER domain comprises a (trans-acting) tracrNucleotide mate sequence followed by a tracrNucleotide sequence. The CER domain can be composed of a DNA sequence, an RNA sequence, a modified DNA sequence, a modified RNA sequence (see for example US20150059010A1, published 26 Feb. 2015), or any combination thereof.

As used herein, the terms “guide polynucleotide/Cas endonuclease complex”, “guide polynucleotide/Cas endonuclease system”, “guide polynucleotide/Cas complex”, “guide polynucleotide/Cas system”, “guided Cas system”, “Polynucleotide-guided endonuclease”, and “PGEN” are used interchangeably herein and refer to at least one guide polynucleotide and at least one Cas endonuclease, that are capable of forming a complex, wherein said guide polynucleotide/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site. A guide polynucleotide/Cas endonuclease complex herein can comprise Cas protein(s) and suitable polynucleotide component(s) of any of the known CRISPR systems (Horvath and Barrangou, 2010, Science 327:167-170; Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15; Zetsche et al., 2015, Cell 163, 1-13; Shmakov et al., 2015, Molecular Cell 60, 1-13).

The terms “guide RNA/Cas endonuclease complex”, “guide RNA/Cas endonuclease system”, “guide RNA/Cas complex”, “guide RNA/Cas system”, “gRNA/Cas complex”, “gRNA/Cas system”, “RNA-guided endonuclease”, and “RGEN” are used interchangeably herein and refer to at least one RNA component and at least one Cas endonuclease that are capable of forming a complex, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site.

The terms “target site”, “target sequence”, “target site sequence”, “target DNA”, “target locus”, “genomic target site”, “genomic target sequence”, “genomic target locus”, “target polynucleotide”, and “protospacer”, are used interchangeably herein and refer to a polynucleotide sequence such as, but not limited to, a nucleotide sequence on a chromosome, episome, a locus, or any other DNA molecule in the genome (including chromosomal, chloroplastic, mitochondrial DNA, plasmid DNA) of a cell, which a guide polynucleotide/Cas endonuclease complex can recognize, bind to, and optionally nick or cleave. The target site can be an endogenous site in the genome of a cell, or alternatively, the target site can be heterologous to the cell and thereby not be naturally occurring in the genome of the cell, or the target site can be found in a heterologous genomic location compared to where it occurs in nature. As used herein, the terms “endogenous target sequence” and “native target sequence” are used interchangeably to refer to a target sequence that is endogenous or native to the genome of a cell and is at the endogenous or native position of that target sequence in the genome of the cell. The terms “artificial target site” and “artificial target sequence” are used interchangeably herein and refer to a target sequence that has been introduced into the genome of a cell. Such an artificial target sequence can be identical in sequence to an endogenous or native target sequence in the genome of a cell but be located in a different position (i.e., a non-endogenous or non-native position) in the genome of a cell.

A “protospacer adjacent motif” (PAM) herein refers to a short nucleotide sequence adjacent to a target sequence (protospacer) that is recognized (targeted) by a guide polynucleotide/Cas endonuclease system described herein. The Cas endonuclease may not successfully recognize a target DNA sequence if the target DNA sequence is not followed by a PAM sequence. The sequence and length of a PAM herein can differ depending on the Cas protein or Cas protein complex used. The PAM sequence can be of any length but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides long.

The terms “altered target site”, “altered target sequence”, “modified target site”, and “modified target sequence” are used interchangeably herein and refer to a target sequence as disclosed herein that comprises at least one alteration when compared to a non-altered target sequence. Such “alterations” include, for example: (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, or (iv) any combination of (i)-(iii).

A “modified nucleotide” or “edited nucleotide” refers to a nucleotide sequence of interest that comprises at least one alteration when compared to its non-modified nucleotide sequence. Such “alterations” include, for example: (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, or (iv) any combination of (i)-(iii).

Methods for “modifying a target site” and “altering a target site” are used interchangeably herein and refer to methods for producing an altered target site.

As used herein, “donor DNA” is a DNA construct that comprises a polynucleotide of interest to be inserted into the target site of a Cas endonuclease.

The term “polynucleotide modification template” includes a polynucleotide that comprises at least one nucleotide modification when compared to the nucleotide sequence to be edited. A nucleotide modification can be at least one nucleotide substitution, addition or deletion. Optionally, the polynucleotide modification template can further comprise homologous nucleotide sequences flanking the at least one nucleotide modification, wherein the flanking homologous nucleotide sequences provide sufficient homology to the desired nucleotide sequence to be edited.

The term “plant-optimized Cas endonuclease” herein refers to a Cas protein, including a multifunctional Cas protein, encoded by a nucleotide sequence that has been optimized for expression in a plant cell or plant.

A “plant-optimized nucleotide sequence encoding a Cas endonuclease”, “plant-optimized construct encoding a Cas endonuclease” and a “plant-optimized polynucleotide encoding a Cas endonuclease” are used interchangeably herein and refer to a nucleotide sequence encoding a Cas protein, or a variant or functional fragment thereof, that has been optimized for expression in a plant cell or plant. A plant comprising a plant-optimized Cas endonuclease includes a plant comprising the nucleotide sequence encoding the Cas sequence and/or a plant comprising the Cas endonuclease protein. In one aspect, the plant-optimized Cas endonuclease nucleotide sequence is a maize-optimized, rice-optimized, wheat-optimized, soybean-optimized, cotton-optimized, or canola-optimized Cas endonuclease.

The term “plant” generically includes whole plants, plant organs, plant tissues, seeds, plant cells, and progeny of the same. Plant cells include, without limitation, cells from seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores. A “plant element” is intended to reference either a whole plant or a plant component, which may comprise differentiated and/or undifferentiated tissues, for example but not limited to plant tissues, parts, and cell types. In one embodiment, a plant element is one of the following: whole plant, seedling, meristematic tissue, ground tissue, vascular tissue, dermal tissue, seed, leaf, root, shoot, stem, flower, fruit, stolon, bulb, tuber, corm, keiki, bud, tumor tissue, and various forms of cells and culture (e.g., single cells, protoplasts, embryos, callus tissue). The term “plant organ” refers to plant tissue or a group of tissues that constitute a morphologically and functionally distinct part of a plant. As used herein, a “plant element” is synonymous with a “portion” of a plant, and refers to any part of the plant, and can include distinct tissues and/or organs, and may be used interchangeably with the term “tissue” throughout. Similarly, a “plant reproductive element” is intended to generically reference any part of a plant that is able to initiate other plants via either sexual or asexual reproduction of that plant, for example but not limited to: seed, seedling, root, shoot, cutting, scion, graft, stolon, bulb, tuber, corm, keiki, or bud. The plant element may be in a plant or in a plant organ, tissue culture, or cell culture.

“Progeny” comprises any subsequent generation of a plant.

As used herein, the term “plant part” refers to plant cells, plant protoplasts, plant cell tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruit, kernels, ears, cobs, husks, stalks, roots, root tips, anthers, and the like, as well as the parts themselves. Grain is intended to mean the mature seed produced by commercial growers for purposes other than growing or reproducing the species. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced polynucleotides.

The term “monocotyledonous” or “monocot” refers to the subclass of angiosperm plants also known as “monocotyledoneae”, whose seeds typically comprise only one embryonic leaf, or cotyledon. The term includes references to whole plants, plant elements, plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and progeny of the same.

The term “dicotyledonous” or “dicot” refers to the subclass of angiosperm plants also known as “dicotyledoneae”, whose seeds typically comprise two embryonic leaves, or cotyledons. The term includes references to whole plants, plant elements, plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and progeny of the same.

As used herein, a “male sterile plant” is a plant that does not produce male gametes that are viable or otherwise capable of fertilization. As used herein, a “female sterile plant” is a plant that does not produce female gametes that are viable or otherwise capable of fertilization. It is recognized that male-sterile and female-sterile plants can be female-fertile and male-fertile, respectively. It is further recognized that a male fertile (but female sterile) plant can produce viable progeny when crossed with a female fertile plant and that a female fertile (but male sterile) plant can produce viable progeny when crossed with a male fertile plant.

The term “non-conventional yeast” herein refers to any yeast that is not a Saccharomyces (e.g., S. cerevisiae) or Schizosaccharomyces yeast species (see “Non-Conventional Yeasts in Genetics, Biochemistry and Biotechnology: Practical Protocols”, K. Wolf, K. D. Breunig, G. Barth, Eds., Springer-Verlag, Berlin, Germany, 2003).

The term “crossed” or “cross” or “crossing” in the context of this disclosure means the fusion of gametes via pollination to produce progeny (i.e., cells, seeds, or plants). The term encompasses both sexual crosses (the pollination of one plant by another) and selfing (self-pollination, i.e., when the pollen and ovule (or microspores and megaspores) are from the same plant or genetically identical plants).

The term “introgression” refers to the transmission of a desired allele of a genetic locus from one genetic background to another. For example, introgression of a desired allele at a specified locus can be transmitted to at least one progeny plant via a sexual cross between two parent plants, where at least one of the parent plants has the desired allele within its genome. Alternatively, for example, transmission of an allele can occur by recombination between two donor genomes, e.g., in a fused protoplast, where at least one of the donor protoplasts has the desired allele in its genome. The desired allele can be, e.g., a transgene, a modified (mutated or edited) native allele, or a selected allele of a marker or QTL.

The term “isoline” is a comparative term, and references organisms that are genetically identical, but differ in treatment. In one example, two genetically identical maize plant embryos may be separated into two different groups, one receiving a treatment (such as the introduction of a CRISPR-Cas effector endonuclease) and one control that does not receive such treatment. Any phenotypic differences between the two groups may thus be attributed solely to the treatment and not to any inherency of the plant's endogenous genetic makeup.

“Introducing” is intended to mean presenting to a target, such as a cell or organism, a polynucleotide or polypeptide or polynucleotide-protein complex, in such a manner that the component(s) gains access to the interior of a cell of the organism or to the cell itself.

A “polynucleotide of interest” includes any nucleotide sequence encoding a protein or polypeptide that improves desirability of crops, i.e. a trait of agronomic interest. Polynucleotides of interest include, but are not limited to, polynucleotides encoding important traits for agronomics, herbicide resistance, insecticidal resistance, disease resistance, nematode resistance, microbial resistance, fungal resistance, viral resistance, fertility or sterility, grain characteristics, commercial products, phenotypic marker, or any other trait of agronomic or commercial importance. A polynucleotide of interest may additionally be utilized in either the sense or anti-sense orientation. Further, more than one polynucleotide of interest may be utilized together, or “stacked”, to provide additional benefit.

A “complex trait locus” includes a genomic locus that has multiple transgenes genetically linked to each other.

The compositions and methods herein may provide for an improved “agronomic trait” or “trait of agronomic importance” or “trait of agronomic interest” to a plant, which may include, but not be limited to, the following: disease resistance, drought tolerance, heat tolerance, cold tolerance, salinity tolerance, metal tolerance, herbicide tolerance, improved water use efficiency, improved nitrogen utilization, improved nitrogen fixation, pest resistance, herbivore resistance, pathogen resistance, yield improvement, health enhancement, vigor improvement, growth improvement, photosynthetic capability improvement, nutrition enhancement, altered protein content, altered oil content, increased biomass, increased shoot length, increased root length, improved root architecture, modulation of a metabolite, modulation of the proteome, increased seed weight, altered seed carbohydrate composition, altered seed oil composition, altered seed protein composition, altered seed nutrient composition, as compared to an isoline plant not comprising a modification derived from the methods or compositions herein.

“Agronomic trait potential” is intended to mean a capability of a plant element for exhibiting a phenotype, preferably an improved agronomic trait, at some point during its life cycle, or conveying said phenotype to another plant element with which it is associated in the same plant.

The terms “decreased,” “fewer,” “slower” and “increased,” “faster,” “enhanced,” “greater” as used herein refer to a decrease or increase in a characteristic of the modified plant element or resulting plant compared to an unmodified plant element or resulting plant. For example, a decrease in a characteristic may be at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, between 5% and 10%, at least 10%, between 10% and 20%, at least 15%, at least 20%, between 20% and 30%, at least 25%, at least 30%, between 30% and 40%, at least 35%, at least 40%, between 40% and 50%, at least 45%, at least 50%, between 50% and 60%, at least about 60%, between 60% and 70%, between 70% and 80%, at least 75%, at least about 80%, between 80% and 90%, at least about 90%, between 90% and 100%, at least 100%, between 100% and 200%, at least 200%, at least about 300%, at least about 400% or more lower than the untreated control, and an increase may be at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, between 5% and 10%, at least 10%, between 10% and 20%, at least 15%, at least 20%, between 20% and 30%, at least 25%, at least 30%, between 30% and 40%, at least 35%, at least 40%, between 40% and 50%, at least 45%, at least 50%, between 50% and 60%, at least about 60%, between 60% and 70%, between 70% and 80%, at least 75%, at least about 80%, between 80% and 90%, at least about 90%, between 90% and 100%, at least 100%, between 100% and 200%, at least 200%, at least about 300%, at least about 400% or more higher than the untreated control.
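The percentage thresholds above are all expressed relative to an untreated control. A minimal Python sketch of that arithmetic is given below; the function names and the example measurements are hypothetical and serve only to show how a measured characteristic might be compared against one of the listed thresholds.

```python
def percent_change(treated: float, control: float) -> float:
    """Signed percent change of a characteristic relative to the control.

    Positive values indicate an increase and negative values a decrease.
    """
    if control == 0:
        raise ValueError("control value must be non-zero")
    return 100.0 * (treated - control) / control


def meets_increase_threshold(treated: float, control: float, threshold_pct: float) -> bool:
    """True if the increase over the control is at least threshold_pct."""
    return percent_change(treated, control) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical measurements of a characteristic (for example, root length in cm).
    print(percent_change(12.0, 10.0))                   # increase relative to control
    print(percent_change(7.5, 10.0))                    # decrease relative to control
    print(meets_increase_threshold(12.0, 10.0, 10.0))
```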

As used herein, the term “before”, in reference to a sequence position, refers to an occurrence of one sequence upstream, or 5′, to another sequence.

The meaning of abbreviations is as follows: “sec” means second(s), “min” means minute(s), “h” means hour(s), “d” means day(s), “uL” means microliter(s), “mL” means milliliter(s), “L” means liter(s), “uM” means micromolar, “mM” means millimolar, “M” means molar, “mmol” means millimole(s), “μmole” or “umole” mean micromole(s), “g” means gram(s), “μg” or “ug” means microgram(s), “ng” means nanogram(s), “U” means unit(s), “bp” means base pair(s) and “kb” means kilobase(s).

Double-Strand-Break (DSB) Inducing Agents (DSB Agents)

Double-strand breaks induced by double-strand-break-inducing agents, such as endonucleases that cleave the phosphodiester bond within a polynucleotide chain, can result in the induction of DNA repair mechanisms, including the non-homologous end-joining pathway, and homologous recombination. Endonucleases include a range of different enzymes, including restriction endonucleases (see e.g. Roberts et al., (2003) Nucleic Acids Res 1:418-20, Roberts et al., (2003) Nucleic Acids Res 31:1805-12, and Belfort et al., (2002) in Mobile DNA II, pp. 761-783, Eds. Craigie et al., (ASM Press, Washington, D.C.)), meganucleases (see e.g., WO 2009/114321; Gao et al. (2010) Plant Journal 1:176-187), TAL effector nucleases or TALENs (see e.g., US20110145940, Christian, M., T. Cermak, et al. 2010. Targeting DNA double-strand breaks with TAL effector nucleases. Genetics 186(2): 757-61 and Boch et al., (2009), Science 326(5959): 1509-12), zinc finger nucleases (see e.g. Kim, Y. G., J. Cha, et al. (1996). “Hybrid restriction enzymes: zinc finger fusions to FokI cleavage”), and CRISPR-Cas endonucleases (see e.g. WO2007/025097 application published Mar. 1, 2007).

In addition to the double-strand break inducing agents, site-specific base conversions can also be achieved to engineer one or more nucleotide changes to create one or more EMEs described herein into the genome. These include, for example, a site-specific base edit mediated by a C⋅G to T⋅A or an A⋅T to G⋅C base editing deaminase enzyme (Gaudelli et al., “Programmable base editing of A⋅T to G⋅C in genomic DNA without DNA cleavage.” Nature (2017); Nishida et al. “Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems.” Science 353 (6305) (2016); Komor et al. “Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage.” Nature 533 (7603) (2016):420-4).

Any double-strand-break or -nick or -modification inducing agent may be used for the methods described herein, including for example but not limited to: Cas endonucleases, recombinases, TALENs, zinc finger nucleases, restriction endonucleases, meganucleases, and deaminases.

CRISPR Systems and Cas Endonucleases

Methods and compositions are provided for polynucleotide modification with a CRISPR Associated (Cas) endonuclease. Class 1 Cas endonucleases comprise multisubunit effector complexes (Types I, III, and IV), while Class 2 systems comprise single protein effectors (Types II, V, and VI) (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15; Zetsche et al., 2015, Cell 163, 1-13; Shmakov et al., 2015, Molecular Cell 60, 1-13; Haft et al., 2005, Computational Biology, PLoS Comput Biol 1(6): e60; and Koonin et al. 2017, Curr Opinion Microbiology 37:67-78). In Class 2 Type II systems, the Cas endonuclease acts in complex with a guide RNA (gRNA) that directs the Cas endonuclease to the DNA target, enabling target recognition, binding, and cleavage by the Cas endonuclease. The gRNA comprises a Cas endonuclease recognition (CER) domain that interacts with the Cas endonuclease, and a Variable Targeting (VT) domain that hybridizes to a nucleotide sequence in a target DNA. In some aspects, the gRNA comprises a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA) to guide the Cas endonuclease to its DNA target. The crRNA comprises a spacer region complementary to one strand of the double strand DNA target and a region that base pairs with the tracrRNA, forming an RNA duplex. In some aspects, the gRNA is a “single guide RNA” (sgRNA) that comprises a synthetic fusion of crRNA and tracrRNA. In many systems, the Cas endonuclease-guide polynucleotide complex recognizes a short nucleotide sequence adjacent to the target sequence (protospacer), called a “protospacer adjacent motif” (PAM).

Examples of a Cas endonuclease include but are not limited to Cas9 and Cpf1. Cas9 (formerly referred to as Cas5, Csn1, or Csx12) is a Class 2 Type II Cas endonuclease (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15). A Cas9-gRNA complex recognizes a 3′ PAM sequence (NGG for the S. pyogenes Cas9) at the target site, permitting the spacer of the guide RNA to invade the double-stranded DNA target, and, if sufficient homology between the spacer and protospacer exists, generate a double-strand break. Cas9 endonucleases comprise RuvC and HNH domains that together produce double strand breaks, and separately can produce single strand breaks. For the S. pyogenes Cas9 endonuclease, the double-strand break leaves a blunt end. Cpf1 is a Class 2 Type V Cas endonuclease, and comprises a RuvC nuclease domain but lacks an HNH domain (Yamano et al., 2016, Cell 165:949-962). Cpf1 endonucleases create “sticky” overhang ends.
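The relationship between a protospacer and its 3′ PAM can be sketched in code. The following Python example scans one strand of a DNA sequence for candidate target sites followed by an NGG PAM, as stated above for the S. pyogenes Cas9. The function name, the 20-nucleotide protospacer length, the restricted IUPAC table, and the example sequence are assumptions for illustration; the sketch does not scan the reverse strand and does not handle endonucleases whose PAM lies 5′ of the protospacer.

```python
import re

# Partial IUPAC table (illustrative); other ambiguity codes are not supported.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]",
}


def find_target_sites(dna: str, pam: str = "NGG", protospacer_len: int = 20):
    """Return (start, protospacer, pam) tuples for sites with a 3' PAM.

    Only the strand given in `dna` (written 5'->3') is scanned. A lookahead
    is used so that overlapping candidate sites are all reported.
    """
    dna = dna.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Hypothetical sequence: 2 nt of context, a 20-nt protospacer, a TGG PAM, 2 nt.
    sequence = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
    for start, protospacer, pam_seq in find_target_sites(sequence):
        print(start, protospacer, pam_seq)
```

Because the PAM sequence and length differ between Cas proteins, the `pam` and `protospacer_len` parameters are left configurable rather than fixed.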

Some uses for Cas9-gRNA systems at a genomic target site include but are not limited to insertions, deletions, substitutions, or modifications of one or more nucleotides at the target site; modifying or replacing nucleotide sequences of interest (such as regulatory elements); insertion of polynucleotides of interest; gene knock-out; gene knock-in; modification of splicing sites and/or introducing alternate splicing sites; modifications of nucleotide sequences encoding a protein of interest; amino acid and/or protein fusions; and gene silencing by expressing an inverted repeat into a gene of interest.

In some aspects, a “polynucleotide modification template” is provided that comprises at least one nucleotide modification when compared to the nucleotide sequence to be edited. A nucleotide modification can be at least one nucleotide substitution, addition, deletion, or chemical alteration. Optionally, the polynucleotide modification template can further comprise homologous nucleotide sequences flanking the at least one nucleotide modification, wherein the flanking homologous nucleotide sequences provide sufficient homology to the desired nucleotide sequence to be edited.

In some aspects, a polynucleotide of interest is inserted at a target site and provided as part of a “donor DNA” molecule. As used herein, “donor DNA” is a DNA construct that comprises a polynucleotide of interest to be inserted into the target site of a Cas endonuclease. The donor DNA construct further comprises a first and a second region of homology that flank the polynucleotide of interest. The first and second regions of homology of the donor DNA share homology to a first and a second genomic region, respectively, present in or flanking the target site of the cell or organism genome. The donor DNA can be tethered to the guide polynucleotide. Tethered donor DNAs can allow for co-localizing target and donor DNA, useful in genome editing, gene insertion, and targeted genome regulation, and can also be useful in targeting post-mitotic cells where function of endogenous HR machinery is expected to be highly diminished (Mali et al., 2013, Nature Methods Vol. 10: 957-963). The amount of homology or sequence identity shared by a target and a donor polynucleotide can vary and includes total lengths and/or regions.
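The layout of a donor DNA, with a polynucleotide of interest flanked by a first and a second region of homology, can be sketched as simple string assembly. In the Python example below, the function names, the homology arm length, the break position, and the sequences are hypothetical and chosen only to show the structure; the disclosure does not prescribe these values, and real homology regions need not be perfectly identical to the genome.

```python
def build_donor(genome: str, break_index: int, insert: str, arm_len: int = 50) -> str:
    """Assemble a donor: left homology arm + polynucleotide of interest + right arm.

    `break_index` is the position in `genome` (0-based) at which the
    double-strand break is assumed to occur. Both arms are copied directly
    from the genomic regions flanking that position (illustrative choice).
    """
    if arm_len <= 0:
        raise ValueError("arm_len must be positive")
    if not (arm_len <= break_index <= len(genome) - arm_len):
        raise ValueError("break site is too close to the sequence ends")
    left_arm = genome[break_index - arm_len:break_index]
    right_arm = genome[break_index:break_index + arm_len]
    return left_arm + insert + right_arm


def expected_hdr_product(genome: str, break_index: int, insert: str) -> str:
    """Sequence expected if the insert is integrated precisely at the break."""
    return genome[:break_index] + insert + genome[break_index:]


if __name__ == "__main__":
    genome = "A" * 10 + "C" * 10        # toy genomic region
    donor = build_donor(genome, break_index=10, insert="GGG", arm_len=5)
    print(donor)
    print(expected_hdr_product(genome, 10, "GGG"))
```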

The process for editing a genomic sequence at a Cas9-gRNA double-strand-break site with a modification template generally comprises: providing a host cell with a Cas9-gRNA complex that recognizes a target sequence in the genome of the host cell and is able to induce a single- or double-strand-break in the genomic sequence, and optionally at least one polynucleotide modification template comprising at least one nucleotide alteration when compared to the nucleotide sequence to be edited. The polynucleotide modification template can further comprise nucleotide sequences flanking the at least one nucleotide alteration, in which the flanking sequences are substantially homologous to the chromosomal region flanking the double-strand break. Genome editing using double-strand-break-inducing agents, such as Cas9-gRNA complexes, has been described, for example in US20150082478 published on 19 Mar. 2015, WO2015026886 published on 26 Feb. 2015, WO2016007347 published 14 Jan. 2016, and WO2016025131 published on 18 Feb. 2016.

To facilitate optimal expression and nuclear localization for eukaryotic cells, the gene comprising the Cas endonuclease may be optimized as described in WO2016186953 published 24 Nov. 2016, and then delivered into cells as DNA expression cassettes by methods known in the art. In some aspects, the Cas endonuclease is provided as a polypeptide. In some aspects, the Cas endonuclease is provided as a polynucleotide encoding a polypeptide. In some aspects, the guide RNA is provided as a DNA molecule encoding one or more RNA molecules. In some aspects, the guide RNA is provided as RNA or chemically-modified RNA. In some aspects, the Cas endonuclease protein and guide RNA are provided as a ribonucleoprotein complex (RNP).

Once a double-strand break is induced in the genome, cellular DNA repair mechanisms are activated to repair the break.

Double-Strand-Break Repair and Polynucleotide Modification

A double-strand-break-inducing agent, such as a guided Cas endonuclease, can recognize and bind to a DNA target sequence and introduce a single-strand break (nick) or a double-strand break. Once a single- or double-strand break is induced in the DNA, the cell's DNA repair mechanism is activated to repair the break, for example via nonhomologous end-joining (NHEJ) or Homology-Directed Repair (HDR) processes, which can lead to modifications at the target site. The most common repair mechanism to bring the broken ends together is the NHEJ pathway (Bleuyard et al., (2006) DNA Repair 5:1-12). The structural integrity of chromosomes is typically preserved by the repair, but deletions, insertions, or other rearrangements (such as chromosomal translocations) are possible (Siebert and Puchta, 2002, Plant Cell 14:1121-31; Pacher et al., 2007, Genetics 175:21-9). NHEJ is often error-prone and can introduce small mutations in the target site. In plants, NHEJ is often the major pathway by which DSBs are remediated; therefore, methods and compositions to improve the probability of HDR or HR in plants are desirable.

As described by Podevin (Podevin, N., Davies, H. V., Hartung, F., Nogue, F. and Casacuberta, J. M. (2013) Site-directed nucleases: a paradigm shift in predictable, knowledge-based plant breeding. Trends Biotechnol. 31(6), 375-383), Hilscher (Hilscher, J., Burstmayr, H. and Stoger, E. (2016) Targeted modification of plant genomes for precision crop breeding. Biotechnol. J. 11, 1-14), and Pacher (Pacher and Puchta (2016), From classical mutagenesis to nuclease-based breeding – directing natural DNA repair for a natural end-product. The Plant Journal 90(4):819-833), three categories of site-directed nuclease (SDN) mediated genome modification have been defined, according to the European Union (EU) New Techniques Working Group (NTWG; European Commission et al.) classification of ZFN activity and regulatory purposes:

SDN1 covers the application of an SDN without an additional donor DNA or repair template. Thus the reaction outcome clearly depends on the DSB repair pathway of the plant genome. As the predominant DSB repair pathway is NHEJ, small insertions or deletions can occur (SDN1a). In the case of tandemly arranged SDNs, larger deletions can be obtained (SDN1b). Furthermore, inversions (SDN1c) or translocations (SDN1d) can be generated by multiplexed SDN1 approaches (Hilscher et al., 2016).

SDN2 describes the use of an SDN with an additional DNA “polynucleotide modification template” to introduce small mutations in a controlled manner. Here, a template mainly homologous to the target sequence is provided to be the substrate for HR-mediated DSB repair following the induction of one or two adjacent DSBs. This approach allows the introduction of small mutations that could also occur naturally, per se. Taking the size of plant genomes into account, small modifications of up to 20 nucleotides can statistically be regarded as genome editing (GE) that resembles naturally occurring genome changes. Therefore, targeted genome modifications using oligonucleotide-directed mutagenesis (ODM) are also regarded as comparable to SDN2.

SDN3 describes the use of an SDN with an additional “donor polynucleotide” or “donor DNA” to introduce large stretches of exogenous DNA at a pre-determined locus, adding or replacing genetic information. Mechanistically, this process relies on HR-mediated DSB repair like SDN2, and the discrimination is arbitrary as the size of the sequence inserted can vary significantly.

Both SDN2 and SDN3 are types of homology-directed repair (HDR) of a double-strand break in a polynucleotide, and involve methods of introducing a heterologous polynucleotide either as a template for repair of the double-strand break (SDN2), or for insertion of a new double-stranded polynucleotide at the double-strand break site (SDN3). SDN2 repairs may be detected by the presence of one or a few nucleotide changes (mutations). SDN3 repairs may be detected by the presence of a novel contiguous heterologous polynucleotide.

Modification of a target polynucleotide includes any one or more of the following: insertion of at least one nucleotide, deletion of at least one nucleotide, chemical alteration of at least one nucleotide, replacement of at least one nucleotide, or mutation of at least one nucleotide. In some aspects, the DNA repair mechanism creates an imperfect repair of the double-strand break, resulting in a change of a nucleotide at the break site. In some aspects, a polynucleotide template may be provided to the break site, wherein the repair results in a template-directed repair of the break. In some aspects, a donor polynucleotide may be provided to the break site, wherein the repair results in the incorporation of the donor polynucleotide into the break site.

In some aspects, the methods and compositions described herein improve the probability of a non-NHEJ repair mechanism outcome at a DSB. In one aspect, an increase of the HDR to NHEJ repair ratio is effected.

Homology-Directed Repair and Homologous Recombination

Homology-directed repair (HDR) is a mechanism in cells to repair double-strand and single-strand DNA breaks. Homology-directed repair includes homologous recombination (HR) and single-strand annealing (SSA) (Lieber. 2010 Annu. Rev. Biochem. 79:181-211). The most common form of HDR is homologous recombination (HR), which has the longest sequence homology requirements between the donor and acceptor DNA. Other forms of HDR include single-strand annealing (SSA) and breakage-induced replication, and these require shorter sequence homology relative to HR. Homology-directed repair at nicks (single-strand breaks) can occur via a mechanism distinct from HDR at double-strand breaks (Davis and Maizels. PNAS 111(10):E924-E932).

By “homology” is meant DNA sequences that are similar. For example, a “region of homology to a genomic region” that is found on the donor DNA is a region of DNA that has a similar sequence to a given “genomic region” in the cell or organism genome. A region of homology can be of any length that is sufficient to promote homologous recombination at the cleaved target site. For example, the region of homology can comprise at least 5-10, 5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, 5-60, 5-65, 5-70, 5-75, 5-80, 5-85, 5-90, 5-95, 5-100, 5-200, 5-300, 5-400, 5-500, 5-600, 5-700, 5-800, 5-900, 5-1000, 5-1100, 5-1200, 5-1300, 5-1400, 5-1500, 5-1600, 5-1700, 5-1800, 5-1900, 5-2000, 5-2100, 5-2200, 5-2300, 5-2400, 5-2500, 5-2600, 5-2700, 5-2800, 5-2900, 5-3000, 5-3100 or more bases in length such that the region of homology has sufficient homology to undergo homologous recombination with the corresponding genomic region. “Sufficient homology” indicates that two polynucleotide sequences have sufficient structural similarity to act as substrates for a homologous recombination reaction. The structural similarity includes overall length of each polynucleotide fragment, as well as the sequence similarity of the polynucleotides. Sequence similarity can be described by the percent sequence identity over the whole length of the sequences, and/or by conserved regions comprising localized similarities such as contiguous nucleotides having 100% sequence identity, and percent sequence identity over a portion of the length of the sequences.

The amount of homology or sequence identity shared by a target and a donor polynucleotide can vary and includes total lengths and/or regions having unit integral values in the ranges of about 1-20 bp, 20-50 bp, 50-100 bp, 75-150 bp, 100-250 bp, 150-300 bp, 200-400 bp, 250-500 bp, 300-600 bp, 350-750 bp, 400-800 bp, 450-900 bp, 500-1000 bp, 600-1250 bp, 700-1500 bp, 800-1750 bp, 900-2000 bp, 1-2.5 kb, 1.5-3 kb, 2-4 kb, 2.5-5 kb, 3-6 kb, 3.5-7 kb, 4-8 kb, 5-10 kb, or up to and including the total length of the target site. These ranges include every integer within the range, for example, the range of 1-20 bp includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20 bps. The amount of homology can also be described by percent sequence identity over the full aligned length of the two polynucleotides, which includes percent sequence identity of about at least 50%, 55%, 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. Sufficient homology includes any combination of polynucleotide length, global percent sequence identity, and optionally conserved regions of contiguous nucleotides or local percent sequence identity; for example, sufficient homology can be described as a region of 75-150 bp having at least 80% sequence identity to a region of the target locus. Sufficient homology can also be described by the predicted ability of two polynucleotides to specifically hybridize under high stringency conditions, see, for example, Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, NY); Current Protocols in Molecular Biology, Ausubel et al., Eds (1994) Current Protocols, (Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.); and, Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, (Elsevier, New York).
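By way of illustration only, the example criterion given above (a region of 75-150 bp having at least 80% sequence identity to a region of the target locus) can be sketched as a simple check. The function names, the default thresholds, and the assumption that the two sequences are already aligned without gaps are hypothetical choices made for this sketch and are not part of the disclosure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length, pre-aligned sequences (no gaps).

    Assumption for this sketch: alignment has been done elsewhere.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def has_sufficient_homology(arm: str, genomic: str,
                            min_len: int = 75, max_len: int = 150,
                            min_identity: float = 80.0) -> bool:
    """True if the arm length lies in [min_len, max_len] and the
    identity to the genomic region meets the threshold."""
    if not (min_len <= len(arm) <= max_len):
        return False
    return percent_identity(arm, genomic) >= min_identity


# Hypothetical 100 bp genomic region and a homology arm with a few mismatches.
genomic_region = "ACGT" * 25
homology_arm = "T" * 10 + genomic_region[10:]
print(percent_identity(homology_arm, genomic_region))
print(has_sufficient_homology(homology_arm, genomic_region))
```

Other combinations of length and identity listed above could be tested by passing different values for the hypothetical `min_len`, `max_len`, and `min_identity` parameters.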

DNA double-strand breaks can be an effective factor to stimulate homologous recombination pathways (Puchta et al., (1995) Plant Mol Biol 28:281-92; Tzfira and White, (2005) Trends Biotechnol 23:567-9; Puchta, (2005) J Exp Bot 56:1-14). Using DNA-breaking agents, a two- to nine-fold increase of homologous recombination was observed between artificially constructed homologous DNA repeats in plants (Puchta et al., (1995) Plant Mol Biol 28:281-92). In maize protoplasts, experiments with linear DNA molecules demonstrated enhanced homologous recombination between plasmids (Lyznik et al., (1991) Mol Gen Genet 230:209-18).

Alteration of the genome of a prokaryotic or eukaryotic cell or organism, for example, through homologous recombination (HR), is a powerful tool for genetic engineering. Homologous recombination has been demonstrated in plants (Halfter et al., (1992) Mol Gen Genet 231:186-93) and insects (Dray and Gloor, 1997, Genetics 147:689-99). Homologous recombination has also been accomplished in other organisms. For example, at least 150-200 bp of homology was required for homologous recombination in the parasitic protozoan Leishmania (Papadopoulou and Dumas, (1997) Nucleic Acids Res 25:4278-86). In the filamentous fungus Aspergillus nidulans, gene replacement has been accomplished with as little as 50 bp flanking homology (Chaveroche et al., (2000) Nucleic Acids Res 28:e97). Targeted gene replacement has also been demonstrated in the ciliate Tetrahymena thermophila (Gaertig et al., (1994) Nucleic Acids Res 22:5391-8). In mammals, homologous recombination has been most successful in the mouse using pluripotent embryonic stem (ES) cell lines that can be grown in culture, transformed, selected and introduced into a mouse embryo (Watson et al., 1992, Recombinant DNA, 2nd Ed., Scientific American Books distributed by WH Freeman & Co.).

Improving the Probability of HDR in DSB Repair

Several methods for encouraging the repair of a double-strand break via HDR are contemplated, based on (1) the fact that Cas9 has a high affinity for, and is slow to release, its cleaved substrate (Richardson, C. et al. (2016) Nat. Biotechnol. 34:339-344); and (2) the observation by the inventors that the mutation outcomes for polynucleotide cleavage are often non-random and reproducible (unpublished). The inventors have conceived that retargeting a polynucleotide double-strand-break site, providing multiple opportunities for DSB repair, encourages the occurrence of HDR (e.g., HR) vs NHEJ. The inventors have also conceived that because recombinogenic intermediates involve 3′ overhangs, additional single-strand breaks flanking the double-strand break site will produce destabilized duplexes, leading to a recombinogenic intermediate. In some cases, different endonucleases (e.g., from different source organisms or CRISPR loci, or engineered enzymes, or nickases) are used.

In some aspects, the fraction or percent of HR reads is greater than that of a comparator, such as a control sample, a sample with NHEJ repair, or as compared to the total mutant reads. In some aspects, the fraction or percent of HR reads is greater than that of the control sample (no DSB agent). In some aspects, the fraction or percent of HR reads is greater than the fraction or percent of NHEJ reads. In some aspects, the fraction or percent of HR reads is greater than the fraction or percent of total mutant reads (NHEJ+HR).

In some aspects, the fraction of HR reads relative to a comparator is at least 2, 3, 4, 5, 6, 7, 8, 9, 10, between 10 and 15, 15, between 15 and 20, 20, between 20 and 25, 25, between 25 and 30, 30, between 30 and 40, 40, between 40 and 50, 50, between 50 and 60, 60, between 60 and 70, 70, between 70 and 80, 80, between 80 and 90, 90, between 90 and 100, 100, between 100 and 125, 125, between 125 and 150, greater than 150, or infinitely greater.

In some aspects, the percent of HR reads relative to a comparator is at least 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% greater.

In some aspects, the percent of HR reads is greater than zero.
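As a non-limiting sketch, the comparisons of HR reads described above (relative to total reads, to NHEJ reads, to total mutant reads, and to a control sample) might be computed from sequencing read counts as follows. The `ReadCounts` class, the field names, and the example counts are hypothetical and are not data from the disclosure; the handling of a zero-valued comparator as an infinite fold change is an assumption chosen to mirror the “infinitely greater” case.

```python
from dataclasses import dataclass


@dataclass
class ReadCounts:
    hr: int         # reads showing the HR (template-directed) outcome
    nhej: int       # reads showing NHEJ-type mutations
    wild_type: int  # unmodified reads


def hr_metrics(c: ReadCounts) -> dict:
    """Express HR reads relative to several comparators."""
    total = c.hr + c.nhej + c.wild_type
    mutant = c.hr + c.nhej  # total mutant reads (NHEJ + HR)
    if c.nhej:
        ratio = c.hr / c.nhej
    else:
        ratio = float("inf") if c.hr else 0.0
    return {
        "hr_percent_of_total": 100.0 * c.hr / total if total else 0.0,
        "hr_percent_of_mutant": 100.0 * c.hr / mutant if mutant else 0.0,
        "hr_to_nhej_ratio": ratio,
    }


def fold_change(treated: float, control: float) -> float:
    """Fold change of a treated value over a control value.

    A zero control with a non-zero treated value is reported as infinity.
    """
    if control == 0:
        return float("inf") if treated > 0 else 0.0
    return treated / control


# Hypothetical counts for a treated sample.
treated = hr_metrics(ReadCounts(hr=50, nhej=150, wild_type=800))
print(treated)
print(fold_change(treated["hr_percent_of_total"], 0.5))
```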

In one aspect of the method, a double-strand-break is created, repaired, and recurrently cleaved by any method or composition, for example but not limited to a Cas endonuclease and guide RNA. Briefly, a DSB inducing agent (e.g., Cas endonuclease and first guide RNA) recognizes, binds to, and cleaves a target polynucleotide. A first double-strand-break is created, and repaired. In some aspects, the repair results in a change of the target site polynucleotide sequence (for example, but not limited to, an insertion of a nucleotide, a deletion of a nucleotide, or a replacement of a nucleotide). In some aspects, a repair template is provided for a specific target polynucleotide repair composition outcome. In this case, the repair template is flanked by inverted target sites (PAM on the inside). A second guide RNA is introduced that is complementary to the mutation that was created by the repair of the first double-strand break. In some aspects, the DSB repair composition outcome is determined by the introduction of a donor polynucleotide template or insertion, and the second guide RNA is designed to be complementary to that determined target sequence outcome. In some aspects, the second guide RNA is designed to be complementary to the most commonly created repair mutation. In some aspects, the second guide RNA is designed to be complementary to a desired DNA repair outcome. In some aspects, a library of second guide RNAs is designed that are complementary to all possible mutations of the target site. The mutation(s) created by the first double-strand-break repair may be either known or predicted bioinformatically. The second guide RNA acts in concert with the Cas endonuclease (either provided de novo or the same Cas endonuclease that was present for the first DSB) to create a second double-strand-break at the same site (within the on-target recognition sequence of the Cas endonuclease/first guide RNA complex). In some aspects, instead of a second guide RNA and a Cas endonuclease creating the second DSB, another DSB inducing reagent may be introduced. The second DSB has a higher probability of repair by HDR than NHEJ, as compared to the repair of the first DSB (i.e., the probability of HDR is increased, or the frequency of HDR is increased, or the ratio of HDR to NHEJ is increased). In general, there is a subsequent cut at a previous cut site, which in some aspects can be accomplished by the introduction of another Cas endonuclease/gRNA complex. Continued cleavage in a sequential manner increases the frequency of HDR as a DSB repair mechanism.
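The design of a second guide RNA against a known or predicted repair outcome can be illustrated with a minimal sketch. It assumes, purely for illustration, a Cas9-type enzyme recognizing an NGG PAM immediately 3′ of a 20-nucleotide protospacer, searches only the strand supplied, and reports the spacer as the protospacer sequence (the guide would pair with the opposite strand). The function names and the example sequences are hypothetical and are not taken from the disclosure.

```python
import re


def find_protospacers(seq: str, spacer_len: int = 20):
    """Return (start, spacer, pam) for every spacer followed by an NGG PAM.

    A lookahead is used so that overlapping sites are also reported.
    """
    seq = seq.upper()
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len
    return [(m.start(), m.group(1), m.group(2))
            for m in re.finditer(pattern, seq)]


def second_guide_spacers(repaired: str, edit_pos: int, spacer_len: int = 20):
    """Spacers in the repaired sequence whose protospacer spans edit_pos."""
    return [spacer
            for start, spacer, _pam in find_protospacers(repaired, spacer_len)
            if start <= edit_pos < start + spacer_len]


# Hypothetical first target: 20-nt protospacer followed by a TGG PAM.
first_spacer = "GATTACAGATTACAGATTAC"
target = "TT" + first_spacer + "TGG" + "AA"

# Hypothetical repair outcome: a 1-bp insertion 3 bp upstream of the PAM.
cut_index = 2 + 17
repaired = target[:cut_index] + "A" + target[cut_index:]

print(find_protospacers(target))
print(second_guide_spacers(repaired, cut_index))
```

A library of second guides, as contemplated above, could be approximated by applying the same hypothetical function to each predicted repair outcome in turn.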

In one aspect of the method, a double-strand-break is created, repaired, and recursively cleaved by any method or composition, for example but not limited to a Cas endonuclease and guide RNA. Briefly, a DSB inducing agent (e.g., Cas endonuclease and first guide RNA) recognizes, binds to, and cleaves a target polynucleotide. The first guide RNA is provided as a DNA sequence on a plasmid that further comprises a spacer sequence. In some aspects, the DNA encoding the gRNA is operably linked to a regulatory expression element. A first double-strand-break is created, and repaired. The composition of the repaired target polynucleotide is used as the basis of a mutation generated by Cas editing of the spacer on the plasmid comprising the gRNA DNA and spacer. The mutated spacer composition directs the generation of a second gRNA that is complementary to the sequence of the repaired targeted polynucleotide of the first DSB, and a second double-strand break is induced at the target site by the Cas endonuclease and the second gRNA. The cycle may then repeat, with the sequence of the newly repaired second DSB then being used as a template for the composition of a third gRNA that is complementary to the sequence of the repaired second DSB polynucleotide, and so forth. In this manner a loop of DSB generation and repair occurs, with each subsequent repair after the first having a higher probability of repair via HDR than NHEJ, as compared to the mechanism of the first repair. The process may be stopped by any of a number of methods, including but not limited to: titrated reagent availability, a mutation induced in the region of the gRNA DNA expression construct that renders the expression cassette or transcribed gRNA non-functional, an external factor that may optionally be inducible or repressible, or the introduction of another molecule.

In one aspect of the method, a nick (cleavage of double-stranded DNA on only one of the two phosphate backbones) is created adjacent to a double-strand-break on a target polynucleotide. In one variation of this aspect, a single nick is created. In one variation of this aspect, two nicks are created. In one variation of this aspect, two nicks are created, one each flanking the two sides of the DSB. In one embodiment, the double-strand-break is created by one Cas endonuclease, and the nick(s) is(are) created by a different molecule (e.g., a molecule derived from a different organism, or a Cas endonuclease that lacks double-strand-break creation functionality but possesses nickase activity (for example, nCas9)). Due to the presence of adjacent nick(s), double-strand-break repair of the DSB at the target site has a higher probability of being repaired by HDR than by NHEJ, or has a higher frequency of HDR as compared to a DSB at the same locus that does not have one or more nicks adjacent to the DSB. In some aspects, the distance between the nick and the DSB site is 10 basepairs, between 10 and 20 basepairs, 20 basepairs, between 20 and 30 basepairs, 30 basepairs, between 30 and 40 basepairs, 40 basepairs, between 40 and 50 basepairs, 50 basepairs, between 50 and 60 basepairs, 60 basepairs, between 60 and 70 basepairs, 70 basepairs, between 70 and 80 basepairs, 80 basepairs, between 80 and 90 basepairs, 90 basepairs, between 90 and 100 basepairs, 100 basepairs, between 100 and 110 basepairs, 110 basepairs, between 110 and 120 basepairs, or greater than 120 basepairs in length.
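For illustration only, the geometry of one or more nicks relative to a DSB (which side of the DSB each nick lies on, and its distance in basepairs) can be summarized as sketched below. The coordinate convention (integer positions along one strand), the function name, and the example positions are assumptions made for this sketch.

```python
from typing import Dict, List


def describe_nicks(dsb_pos: int, nick_positions: List[int]) -> Dict:
    """Summarize nicks relative to a DSB.

    Positions are integer coordinates in basepairs along one strand
    (an assumption of this sketch).
    """
    nicks = []
    for pos in nick_positions:
        if pos == dsb_pos:
            raise ValueError("a nick must be adjacent to, not at, the DSB")
        nicks.append({
            "position": pos,
            "side": "upstream" if pos < dsb_pos else "downstream",
            "distance_bp": abs(pos - dsb_pos),
        })
    sides = {n["side"] for n in nicks}
    return {
        "nicks": nicks,
        # True only when at least one nick lies on each side of the DSB.
        "flanks_both_sides": sides == {"upstream", "downstream"},
    }


# Hypothetical layout: DSB at position 500 with one nick on each side.
summary = describe_nicks(dsb_pos=500, nick_positions=[460, 535])
for nick in summary["nicks"]:
    print(nick)
print("flanks both sides:", summary["flanks_both_sides"])
```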

In addition to improving the probability of an HDR repair mechanism outcome, other DNA repair outcomes that are contemplated to be improved with the methods described herein include gene targeting, gene editing, gene drop-out, gene swap (deletion plus insertion), and promoter swap (deletion plus insertion).

Gene Targeting

The compositions and methods described herein can be used for gene targeting.

In general, DNA targeting can be performed by cleaving one or both strands at a specific polynucleotide sequence in a cell with a Cas endonuclease associated with a suitable guide polynucleotide component. Once a single- or double-strand break is induced in the DNA, the cell's DNA repair mechanism is activated to repair the break via nonhomologous end-joining (NHEJ) or Homology-Directed Repair (HDR) processes, which can lead to modifications at the target site.

The length of the DNA sequence at the target site can vary, and includes, for example, target sites that are at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more than 30 nucleotides in length. It is further possible that the target site can be palindromic, that is, the sequence on one strand reads the same in the opposite direction on the complementary strand. The nick/cleavage site can be within the target sequence or the nick/cleavage site could be outside of the target sequence. In another variation, the cleavage could occur at nucleotide positions immediately opposite each other to produce a blunt end cut or, in other cases, the incisions could be staggered to produce single-stranded overhangs, also called “sticky ends”, which can be either 5′ overhangs or 3′ overhangs. Active variants of genomic target sites can also be used. Such active variants can comprise at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the given target site, wherein the active variants retain biological activity and hence are capable of being recognized and cleaved by a Cas endonuclease.
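The notions of a palindromic target site and of blunt versus staggered cuts can be made concrete with a short sketch. It assumes cut positions are expressed in top-strand coordinates, counted from the 5′ end of the top strand, for both strands; the function names are hypothetical. The two restriction sites used as examples (GAATTC, cut by EcoRI to leave 5′ overhangs, and CTGCAG, cut by PstI to leave 3′ overhangs) are generally known and are included only to check the convention.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


def end_type(top_cut: int, bottom_cut: int) -> str:
    """Classify the ends left by one cut on each strand.

    Both cut positions are given in top-strand coordinates: a value k
    means the backbone is cut between bases k-1 and k (0-based).
    """
    if top_cut == bottom_cut:
        return "blunt"
    # Top strand cut further 5' than the bottom strand leaves 5' overhangs.
    return "5' overhang" if top_cut < bottom_cut else "3' overhang"


print(is_palindromic("GAATTC"))  # EcoRI site
print(end_type(1, 5))            # staggered as for EcoRI (G^AATTC)
print(end_type(5, 1))            # staggered as for PstI (CTGCA^G)
print(end_type(3, 3))            # cuts directly opposite each other
```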

Assays to measure the single- or double-strand break of a target site by an endonuclease are known in the art and generally measure the overall activity and specificity of the agent on DNA substrates comprising recognition sites.

A targeting method herein can be performed in such a way that two or more DNA target sites are targeted. Such a method can optionally be characterized as a multiplex method. Two, three, four, five, six, seven, eight, nine, ten, or more target sites can be targeted at the same time in certain embodiments. A multiplex method is typically performed by a targeting method herein in which multiple different RNA components are provided, each designed to guide a guide polynucleotide/Cas endonuclease complex to a unique DNA target site.

Gene Editing

The process for editing a genomic sequence combining DSB and modification templates generally comprises: introducing into a host cell a DSB-inducing agent, or a nucleic acid encoding a DSB-inducing agent, that recognizes a target sequence in the chromosomal sequence and is able to induce a DSB in the genomic sequence, and at least one polynucleotide modification template comprising at least one nucleotide alteration when compared to the nucleotide sequence to be edited. The polynucleotide modification template can further comprise nucleotide sequences flanking the at least one nucleotide alteration, in which the flanking sequences are substantially homologous to the chromosomal region flanking the DSB. Genome editing using DSB-inducing agents, such as Cas-gRNA complexes, has been described, for example in US20150082478 published on 19 Mar. 2015, WO2015026886 published on 26 Feb. 2015, WO2016007347 published 14 Jan. 2016, and WO2016025131 published on 18 Feb. 2016.

Some uses for guide RNA/Cas endonuclease systems have been described (see for example: US20150082478 A1 published 19 Mar. 2015, WO2015026886 published 26 Feb. 2015, and US20150059010 published 26 Feb. 2015) and include but are not limited to modifying or replacing nucleotide sequences of interest (such as regulatory elements), insertion of polynucleotides of interest, gene drop-out, gene knock-out, gene knock-in, modification of splicing sites and/or introducing alternate splicing sites, modifications of nucleotide sequences encoding a protein of interest, amino acid and/or protein fusions, and gene silencing by expressing an inverted repeat into a gene of interest.

Proteins may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known. For example, amino acid sequence variants of the protein(s) can be prepared by mutations in the DNA. Methods for mutagenesis and nucleotide sequence alterations include, for example, Kunkel, (1985) Proc. Natl. Acad. Sci. USA 82:488-92; Kunkel et al., (1987) Meth Enzymol 154:367-82; U.S. Pat. No. 4,873,192; Walker and Gaastra, eds. (1983) Techniques in Molecular Biology (MacMillan Publishing Company, New York) and the references cited therein. Guidance regarding amino acid substitutions not likely to affect biological activity of the protein is found, for example, in the model of Dayhoff et al., (1978) Atlas of Protein Sequence and Structure (Natl Biomed Res Found, Washington, D.C.). Conservative substitutions, such as exchanging one amino acid with another having similar properties, may be preferable. Conservative deletions, insertions, and amino acid substitutions are not expected to produce radical changes in the characteristics of the protein, and the effect of any substitution, deletion, insertion, or combination thereof can be evaluated by routine screening assays. Assays for double-strand-break-inducing activity are known and generally measure the overall activity and specificity of the agent on DNA substrates comprising target sites.

Described herein are methods for genome editing with Cleavage Ready Cascade (crCascade) Complexes. Following characterization of the guide RNA and PAM sequence, components of the cleavage ready Cascade (crCascade) complex and associated CRISPR RNA (crRNA) may be utilized to modify chromosomal DNA in other organisms including plants. To facilitate optimal expression and nuclear localization (for eukaryotic cells), the genes comprising the crCascade may be optimized as described in WO2016186953 published 24 Nov. 2016, and then delivered into cells as DNA expression cassettes by methods known in the art. The components necessary to comprise an active crCascade complex may also be delivered as RNA with or without modifications that protect the RNA from degradation or as mRNA capped or uncapped (Zhang, Y. et al., 2016, Nat. Commun. 7:12617) or Cas protein guide polynucleotide complexes (WO2017070032 published 27 Apr. 2017), or any combination thereof. Additionally, a part or part(s) of the crCascade complex and crRNA may be expressed from a DNA construct while other components are delivered as RNA with or without modifications that protect the RNA from degradation or as mRNA capped or uncapped (Zhang et al. 2016 Nat. Commun. 7:12617) or Cas protein guide polynucleotide complexes (WO2017070032 published 27 Apr. 2017) or any combination thereof. To produce crRNAs in vivo, tRNA derived elements may also be used to recruit endogenous RNases to cleave crRNA transcripts into mature forms capable of guiding the crCascade complex to its DNA target site, as described, for example, in WO2017105991 published 22 Jun. 2017. crCascade nickase complexes may be utilized separately or concertedly to generate a single or multiple DNA nicks on one or both DNA strands. Furthermore, the cleavage activity of the Cas endonuclease may be deactivated by altering key catalytic residues in its cleavage domain (Sinkunas, T. et al., 2013, EMBO J. 32:385-394) resulting in an RNA guided helicase that may be used to enhance homology-directed repair, induce transcriptional activation, or remodel local DNA structures. Moreover, the activity of the Cas cleavage and helicase domains may both be knocked-out and used in combination with other DNA cutting, DNA nicking, DNA binding, transcriptional activation, transcriptional repression, DNA remodeling, DNA deamination, DNA unwinding, DNA recombination enhancing, DNA integration, DNA inversion, and DNA repair agents.

The transcriptional direction of the tracrRNA for the CRISPR-Cas system (if present) and other components of the CRISPR-Cas system (such as variable targeting domain, crRNA repeat, loop, anti-repeat) can be deduced as described in WO2016186946 published 24 Nov. 2016, and WO2016186953 published 24 Nov. 2016.

As described herein, once the appropriate guide RNA requirement is established, the PAM preferences for each new system disclosed herein may be examined. If the cleavage ready Cascade (crCascade) complex results in degradation of the randomized PAM library, the crCascade complex can be converted into a nickase by disabling the ATPase dependent helicase activity either through mutagenesis of critical residues or by assembling the reaction in the absence of ATP as described previously (Sinkunas, T. et al., 2013, EMBO J. 32:385-394). Two regions of PAM randomization separated by two protospacer targets may be utilized to generate a double-stranded DNA break which may be captured and sequenced to examine the PAM sequences that support cleavage by the respective crCascade complex.

In one embodiment, the invention describes a method for modifying a target site in the genome of a cell, the method comprising introducing into a cell at least one Cas endonuclease and guide RNA, and identifying at least one cell that has a modification at the target site.

The nucleotide to be edited can be located within or outside a target site recognized and cleaved by a Cas endonuclease. In one embodiment, the at least one nucleotide modification is not a modification at a target site recognized and cleaved by a Cas endonuclease. In another embodiment, there are at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 900 or 1000 nucleotides between the at least one nucleotide to be edited and the genomic target site.

A knock-out may be produced by an indel (insertion or deletion of nucleotide bases in a target DNA sequence through NHEJ), or by specific removal of sequence that reduces or completely destroys the function of sequence at or near the targeting site.

A guide polynucleotide/Cas endonuclease induced targeted mutation can occur in a nucleotide sequence that is located within or outside a genomic target site that is recognized and cleaved by the Cas endonuclease.

The method for editing a nucleotide sequence in the genome of a cell can be a method without the use of an exogenous selectable marker by restoring function to a non-functional gene product.

In one embodiment, the invention describes a method for modifying a target site in the genome of a cell, the method comprising introducing into a cell at least one PGEN described herein and at least one donor DNA, wherein said donor DNA comprises a polynucleotide of interest, and optionally, further comprising identifying at least one cell that has said polynucleotide of interest integrated in or near said target site.

In one aspect, the methods disclosed herein may employ homologous recombination (HR) to provide integration of the polynucleotide of interest at the target site.

Various methods and compositions can be employed to produce a cell or organism having a polynucleotide of interest inserted in a target site via activity of a CRISPR-Cas system component described herein. In one method described herein, a polynucleotide of interest is introduced into the organism cell via a donor DNA construct. As used herein, “donor DNA” is a DNA construct that comprises a polynucleotide of interest to be inserted into the target site of a Cas endonuclease. The donor DNA construct further comprises a first and a second region of homology that flank the polynucleotide of interest. The first and second regions of homology of the donor DNA share homology to a first and a second genomic region, respectively, present in or flanking the target site of the cell or organism genome.

The donor DNA can be tethered to the guide polynucleotide. Tethered donor DNAs can allow for co-localizing target and donor DNA, useful in genome editing, gene insertion, and targeted genome regulation, and can also be useful in targeting post-mitotic cells where function of endogenous HR machinery is expected to be highly diminished (Mali et al., 2013, Nature Methods Vol. 10: 957-963).

The amount of homology or sequence identity shared by a target and a donor polynucleotide can vary and includes total lengths and/or regions having unit integral values in the ranges of about 1-20 bp, 20-50 bp, 50-100 bp, 75-150 bp, 100-250 bp, 150-300 bp, 200-400 bp, 250-500 bp, 300-600 bp, 350-750 bp, 400-800 bp, 450-900 bp, 500-1000 bp, 600-1250 bp, 700-1500 bp, 800-1750 bp, 900-2000 bp, 1-2.5 kb, 1.5-3 kb, 2-4 kb, 2.5-5 kb, 3-6 kb, 3.5-7 kb, 4-8 kb, 5-10 kb, or up to and including the total length of the target site. These ranges include every integer within the range, for example, the range of 1-20 bp includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20 bps. The amount of homology can also be described by percent sequence identity over the full aligned length of the two polynucleotides, which includes percent sequence identity of at least about 50%, 55%, 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. Sufficient homology includes any combination of polynucleotide length, global percent sequence identity, and optionally conserved regions of contiguous nucleotides or local percent sequence identity; for example, sufficient homology can be described as a region of 75-150 bp having at least 80% sequence identity to a region of the target locus. Sufficient homology can also be described by the predicted ability of two polynucleotides to specifically hybridize under high stringency conditions, see, for example, Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, NY); Current Protocols in Molecular Biology, Ausubel et al., Eds (1994) Current Protocols, (Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.); and, Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, (Elsevier, New York).
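The example criterion given above (a region of 75-150 bp having at least 80% sequence identity) can be expressed as a short Python sketch. It assumes the two sequences have already been aligned to equal length, counts gap characters as mismatches, and uses hypothetical function names (`percent_identity`, `is_sufficient_homology`) and made-up sequences; it is one possible reading of the criterion, not a definitive implementation.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over the full aligned length; gaps ('-') count as mismatches."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be pre-aligned, equal length and non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def is_sufficient_homology(a: str, b: str, min_len: int = 75,
                           max_len: int = 150, min_identity: float = 80.0) -> bool:
    """Check the illustrative 75-150 bp / at least 80% identity criterion."""
    return min_len <= len(a) <= max_len and percent_identity(a, b) >= min_identity


target = "ACGT" * 25  # 100 bp toy target region

# Donor arm with every 10th base changed: 10 mismatches in 100 positions.
arm = list(target)
for i in range(0, 100, 10):
    arm[i] = "A" if arm[i] == "T" else "T"
arm = "".join(arm)

print(percent_identity(target, arm))
print(is_sufficient_homology(target, arm))
```

The length and identity thresholds are parameters so that any of the other length ranges or identity values listed above can be substituted.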

Episomal DNA molecules can also be ligated into the double-strand break, for example, integration of T-DNAs into chromosomal double-strand breaks (Chilton and Que, (2003) Plant Physiol 133:956-65; Salomon and Puchta, (1998) EMBO J 17:6086-95). Once the sequence around the double-strand breaks is altered, for example, by exonuclease activities involved in the maturation of double-strand breaks, gene conversion pathways can restore the original structure if a homologous sequence is available, such as a homologous chromosome in non-dividing somatic cells, or a sister chromatid after DNA replication (Molinier et al., (2004) Plant Cell 16:342-52). Ectopic and/or epigenic DNA sequences may also serve as a DNA repair template for homologous recombination (Puchta, (1999) Genetics 152:1173-81).

In one embodiment, the disclosure comprises a method for editing a nucleotide sequence in the genome of a cell, the method comprising introducing into a cell at least one PGEN described herein, and a polynucleotide modification template, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence, and optionally further comprising selecting at least one cell that comprises the edited nucleotide sequence.

The guide polynucleotide/Cas endonuclease system can be used in combination with at least one polynucleotide modification template to allow for editing (modification) of a genomic nucleotide sequence of interest. (See also US20150082478, published 19 Mar. 2015 and WO2015026886 published 26 Feb. 2015).

Polynucleotides of interest and/or traits can be stacked together in a complex trait locus as described in WO2012129373 published 27 Sep. 2012, and in WO2013112686, published 1 Aug. 2013. The guide polynucleotide/Cas9 endonuclease system described herein provides for an efficient system to generate double-strand breaks and allows for traits to be stacked in a complex trait locus.

A guide polynucleotide/Cas system as described herein, mediating gene targeting, can be used in methods for directing heterologous gene insertion and/or for producing complex trait loci comprising multiple heterologous genes in a fashion similar to that disclosed in WO2012129373 published 27 Sep. 2012, where instead of using a double-strand break inducing agent to introduce a gene of interest, a guide polynucleotide/Cas system as disclosed herein is used. By inserting independent transgenes within 0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 2, or even 5 centimorgans (cM) from each other, the transgenes can be bred as a single genetic locus (see, for example, US20130263324 published 3 Oct. 2013 or WO2012129373 published 14 Mar. 2013). After selecting a plant comprising a transgene, plants comprising (at least) one transgene can be crossed to form an F1 that comprises both transgenes. In progeny from these F1 (F2 or BC1), 1/500 progeny would have the two different transgenes recombined onto the same chromosome. The complex locus can then be bred as a single genetic locus with both transgene traits. This process can be repeated to stack as many traits as desired.
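The relationship between map distance and the frequency of recovering both transgenes on one chromosome can be approximated with simple arithmetic. The Python sketch below rests on textbook simplifications that are assumptions here, not statements from the disclosure: the two transgenes sit on homologous chromosomes in the F1 (in trans), the recombination fraction is approximately the map distance in cM divided by 100 (reasonable only for small distances), and half of the recombinant gametes carry both transgenes. The function name is hypothetical.

```python
def stacked_gamete_fraction(distance_cm: float) -> float:
    """Approximate fraction of F1 gametes carrying both transgenes on one chromosome.

    Assumes the transgenes start in trans and that recombination fraction
    r is about distance_cm / 100; half of the recombinant gametes carry both.
    """
    if distance_cm < 0:
        raise ValueError("distance must be non-negative")
    r = distance_cm / 100.0
    return r / 2.0


for d in (0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 2.0, 5.0):
    fraction = stacked_gamete_fraction(d)
    print(f"{d:>4} cM -> about 1 in {round(1 / fraction)} gametes")
```

Under these simplifications a spacing of 0.4 cM corresponds to about 1 in 500 gametes, which is of the same order as the frequency quoted above; the disclosure does not state which spacing or cross design that figure assumes.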

Further uses for guide RNA/Cas endonuclease systems have been described (See for example: US20150082478 published 19 Mar. 2015, WO2015026886 published 26 Feb. 2015, US20150059010 published 26 Feb. 2015, WO2016007347 published 14 Jan. 2016, and PCT application WO2016025131 published 18 Feb. 2016) and include but are not limited to modifying or replacing nucleotide sequences of interest (such as regulatory elements), insertion of polynucleotides of interest, gene knock-out, gene knock-in, modification of splicing sites and/or introducing alternate splicing sites, modifications of nucleotide sequences encoding a protein of interest, amino acid and/or protein fusions, and gene silencing by expressing an inverted repeat into a gene of interest.

Resulting characteristics from the gene editing compositions and methods described herein may be evaluated. Chromosomal intervals that correlate with a phenotype or trait of interest can be identified. A variety of methods well known in the art are available for identifying chromosomal intervals. The boundaries of such chromosomal intervals are drawn to encompass markers that will be linked to the gene controlling the trait of interest. In other words, the chromosomal interval is drawn such that any marker that lies within that interval (including the terminal markers that define the boundaries of the interval) can be used as a marker for a particular trait. In one embodiment, the chromosomal interval comprises at least one QTL, and furthermore, may indeed comprise more than one QTL. Close proximity of multiple QTLs in the same interval may obfuscate the correlation of a particular marker with a particular QTL, as one marker may demonstrate linkage to more than one QTL. Conversely, e.g., if two markers in close proximity show co-segregation with the desired phenotypic trait, it is sometimes unclear if each of those markers identifies the same QTL or two different QTL. The term “quantitative trait locus” or “QTL” refers to a region of DNA that is associated with the differential expression of a quantitative phenotypic trait in at least one genetic background, e.g., in at least one breeding population. The region of the QTL encompasses or is closely linked to the gene or genes that affect the trait in question. An “allele of a QTL” can comprise multiple genes or other genetic factors within a contiguous genomic region or linkage group, such as a haplotype. An allele of a QTL can denote a haplotype within a specified window wherein said window is a contiguous genomic region that can be defined, and tracked, with a set of one or more polymorphic markers. A haplotype can be defined by the unique fingerprint of alleles at each marker within the specified window.

In Vitro Modification of Polynucleotides

The compositions disclosed herein may further be used as compositions for use in in vitro methods, in some aspects with isolated polynucleotide sequence(s). Said isolated polynucleotide sequence(s) may comprise one or more target sequence(s) for modification. In some aspects, said isolated polynucleotide sequence(s) may be genomic DNA, a PCR product, or a synthesized oligonucleotide.

Modification of a target sequence may be in the form of a nucleotide insertion, a nucleotide deletion, a nucleotide substitution, the addition of an atom or molecule to an existing nucleotide, a nucleotide modification, or the binding of a heterologous polynucleotide or polypeptide to said target sequence. The insertion of one or more nucleotides may be accomplished by the inclusion of a donor polynucleotide in the reaction mixture: said donor polynucleotide is inserted into a double-strand break created by said Cas endonuclease. The insertion may be via non-homologous end joining or via homologous recombination.

In one aspect, the sequence of the target polynucleotide is known prior to modification, and compared to the sequence(s) of polynucleotide(s) that result from treatment with the compositions described herein. In one aspect, the sequence of the target polynucleotide is not known prior to modification.

Any or all of the possible polynucleotide components of the reaction (e.g., guide polynucleotide, donor polynucleotide, optionally a cas polynucleotide) may be provided as part of a vector, a construct, a linearized or circularized plasmid, or as part of a chimeric molecule. Each component may be provided to the reaction mixture separately or together. In some aspects, one or more of the polynucleotide components are operably linked to a heterologous noncoding regulatory element that regulates its expression.

The method for modification of a target polynucleotide comprises combining the minimal elements into a reaction mixture comprising: a Cas endonuclease (or variant, fragment, or other related molecule as described above), a guide polynucleotide comprising a sequence that is substantially complementary to, or selectively hybridizes to, the target polynucleotide sequence of the target polynucleotide, and a target polynucleotide for modification. In some aspects, the Cas endonuclease is provided as a polypeptide. In some aspects, the Cas endonuclease is provided as a cas polynucleotide. In some aspects, the guide polynucleotide is provided as an RNA molecule, a DNA molecule, an RNA:DNA hybrid, or a polynucleotide molecule comprising a chemically-modified nucleotide.

The storage buffer of any one of the components, or the reaction mixture, may be optimized for stability, efficacy, or other parameters. Additional components of the storage buffer or the reaction mixture may include a buffer composition, Tris, EDTA, dithiothreitol (DTT), phosphate-buffered saline (PBS), sodium chloride, magnesium chloride, HEPES, glycerol, BSA, a salt, an emulsifier, a detergent, a chelating agent, a redox reagent, an antibody, nuclease-free water, a proteinase, and/or a viscosity agent. In some aspects, the storage buffer or reaction mixture further comprises a buffer solution with at least one of the following components: HEPES, MgCl2, NaCl, EDTA, a proteinase, Proteinase K, glycerol, nuclease-free water.

Incubation conditions will vary according to desired outcome. The temperature is preferably at least 10 degrees Celsius, between 10 and 15, at least 15, between 15 and 17, at least 17, between 17 and 20, at least 20, between 20 and 22, at least 22, between 22 and 25, at least 25, between 25 and 27, at least 27, between 27 and 30, at least 30, between 30 and 32, at least 32, between 32 and 35, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, or even greater than 40 degrees Celsius. The time of incubation is at least 1 minute, at least 2 minutes, at least 3 minutes, at least 4 minutes, at least 5 minutes, at least 6 minutes, at least 7 minutes, at least 8 minutes, at least 9 minutes, at least 10 minutes, or even greater than 10 minutes.

The sequence(s) of the polynucleotide(s) in the reaction mixture prior to, during, or after incubation may be determined by any method known in the art. In one aspect, modification of a target polynucleotide may be ascertained by comparing the sequence(s) of the polynucleotide(s) purified from the reaction mixture to the sequence of the target polynucleotide prior to combining with the Cas endonuclease.
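The comparison of a target sequence before and after treatment can be sketched in Python with the standard-library `difflib.SequenceMatcher`. This is a plain string comparison chosen for brevity, not a sequence aligner of the kind normally used on sequencing reads, and the function name `describe_modifications` and the toy sequences are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def describe_modifications(before: str, after: str):
    """List differences between two sequences as (kind, position, old, new) tuples.

    kind is one of 'insert', 'delete' or 'replace'; position is the 0-based
    index in the 'before' sequence where the difference starts.
    """
    matcher = SequenceMatcher(None, before, after, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, i1, before[i1:i2], after[j1:j2]))
    return changes


untreated = "AAACCCGGGTTT"   # made-up target sequence
treated = "AAACCCTAGGGTTT"   # same sequence with two bases inserted
for change in describe_modifications(untreated, treated):
    print(change)
```

An empty list indicates that no modification was detected between the two sequences supplied.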

Any one or more of the compositions disclosed herein, useful for in vitro or in vivo polynucleotide detection, binding, and/or modification, may be comprised within a kit. A kit comprises a Cas endonuclease or a polynucleotide cas encoding such, optionally further comprising buffer components to enable efficient storage, and one or more additional compositions that enable the introduction of said Cas endonuclease or cas to a heterologous polynucleotide, wherein said Cas endonuclease or cas is capable of effecting a modification, addition, deletion, or substitution of at least one nucleotide of said heterologous polynucleotide. In an additional aspect, a Cas endonuclease disclosed herein may be used for the enrichment of one or more polynucleotide target sequences from a mixed pool. In an additional aspect, a Cas endonuclease disclosed herein may be immobilized on a matrix for use in in vitro target polynucleotide detection, binding, and/or modification.

Recombinant Constructs and Transformation of Cells

The disclosed guide polynucleotides, Cas endonucleases, polynucleotide modification templates, donor DNAs, guide polynucleotide/Cas endonuclease systems disclosed herein, and any one combination thereof, optionally further comprising one or more polynucleotide(s) of interest, can be introduced into a cell. Cells include, but are not limited to, human, non-human, animal, bacterial, fungal, insect, yeast, non-conventional yeast, and plant cells as well as plants and seeds produced by the methods described herein.

Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook et al., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989). Transformation methods are well known to those skilled in the art and are described infra.

Vectors and constructs include circular plasmids, and linear polynucleotides, comprising a polynucleotide of interest and optionally other components including linkers, adapters, regulatory or analysis. In some examples a recognition site and/or target site can be comprised within an intron, coding sequence, 5′ UTRs, 3′ UTRs, and/or regulatory regions.

Components for Expression and Utilization of CRISPR-Cas Systems in Prokaryotic and Eukaryotic Cells

The invention further provides expression constructs for expressing in a prokaryotic or eukaryotic cell/organism a guide RNA/Cas system that is capable of recognizing, binding to, and optionally nicking, unwinding, or cleaving all or part of a target sequence.

In one embodiment, the expression constructs of the disclosure comprise a promoter operably linked to a nucleotide sequence encoding a Cas gene (or plant optimized, including a Cas endonuclease gene described herein) and a promoter operably linked to a guide RNA of the present disclosure. The promoter is capable of driving expression of an operably linked nucleotide sequence in a prokaryotic or eukaryotic cell/organism.

Nucleotide sequence modification of the guide polynucleotide, VT domain and/or CER domain can be selected from, but not limited to, the group consisting of a 5′ cap, a 3′ polyadenylated tail, a riboswitch sequence, a stability control sequence, a sequence that forms a dsRNA duplex, a modification or sequence that targets the guide polynucleotide to a subcellular location, a modification or sequence that provides for tracking, a modification or sequence that provides a binding site for proteins, a Locked Nucleic Acid (LNA), a 5-methyl dC nucleotide, a 2,6-Diaminopurine nucleotide, a 2′-Fluoro A nucleotide, a 2′-Fluoro U nucleotide, a 2′-O-Methyl RNA nucleotide, a phosphorothioate bond, linkage to a cholesterol molecule, linkage to a polyethylene glycol molecule, linkage to a spacer 18 molecule, a 5′ to 3′ covalent linkage, or any combination thereof. These modifications can result in at least one additional beneficial feature, wherein the additional beneficial feature is selected from the group of a modified or regulated stability, a subcellular targeting, tracking, a fluorescent label, a binding site for a protein or protein complex, modified binding affinity to complementary target sequence, modified resistance to cellular degradation, and increased cellular permeability.

A method of expressing RNA components such as gRNA in eukaryotic cells for performing Cas9-mediated DNA targeting has been to use RNA polymerase III (Pol III) promoters, which allow for transcription of RNA with precisely defined, unmodified, 5′- and 3′-ends (DiCarlo et al., Nucleic Acids Res. 41: 4336-4343; Ma et al., Mol. Ther. Nucleic Acids 3:e161). This strategy has been successfully applied in cells of several different species including maize and soybean (US20150082478 published 19 Mar. 2015). Methods for expressing RNA components that do not have a 5′ cap have been described (WO2016/025131 published 18 Feb. 2016).

Various methods and compositions can be employed to obtain a cell or organism having a polynucleotide of interest inserted in a target site for a Cas endonuclease. Such methods can employ homologous recombination (HR) to provide integration of the polynucleotide of interest at the target site. In one method described herein, a polynucleotide of interest is introduced into the organism cell via a donor DNA construct.

The donor DNA construct further comprises a first and a second region of homology that flank the polynucleotide of interest. The first and second regions of homology of the donor DNA share homology to a first and a second genomic region, respectively, present in or flanking the target site of the cell or organism genome.

The donor DNA can be tethered to the guide polynucleotide. Tethered donor DNAs can allow for co-localizing target and donor DNA, useful in genome editing, gene insertion, and targeted genome regulation, and can also be useful in targeting post-mitotic cells where function of endogenous HR machinery is expected to be highly diminished (Mali et al., 2013, Nature Methods Vol. 10: 957-963).

The amount of homology or sequence identity shared by a target and a donor polynucleotide can vary and includes total lengths and/or regions having unit integral values in the ranges of about 1-20 bp, 20-50 bp, 50-100 bp, 75-150 bp, 100-250 bp, 150-300 bp, 200-400 bp, 250-500 bp, 300-600 bp, 350-750 bp, 400-800 bp, 450-900 bp, 500-1000 bp, 600-1250 bp, 700-1500 bp, 800-1750 bp, 900-2000 bp, 1-2.5 kb, 1.5-3 kb, 2-4 kb, 2.5-5 kb, 3-6 kb, 3.5-7 kb, 4-8 kb, 5-10 kb, or up to and including the total length of the target site. These ranges include every integer within the range, for example, the range of 1-20 bp includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20 bps. The amount of homology can also be described by percent sequence identity over the full aligned length of the two polynucleotides, which includes percent sequence identity of at least about 50%, 55%, 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, between 98% and 99%, 99%, between 99% and 100%, or 100%. Sufficient homology includes any combination of polynucleotide length, global percent sequence identity, and optionally conserved regions of contiguous nucleotides or local percent sequence identity; for example, sufficient homology can be described as a region of 75-150 bp having at least 80% sequence identity to a region of the target locus. Sufficient homology can also be described by the predicted ability of two polynucleotides to specifically hybridize under high stringency conditions, see, for example, Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, NY); Current Protocols in Molecular Biology, Ausubel et al., Eds (1994) Current Protocols, (Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.); and, Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, (Elsevier, New York).

The structural similarity between a given genomic region and the corresponding region of homology found on the donor DNA can be any degree of sequence identity that allows for homologous recombination to occur. For example, the amount of homology or sequence identity shared by the “region of homology” of the donor DNA and the “genomic region” of the organism genome can be at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, such that the sequences undergo homologous recombination.

The region of homology on the donor DNA can have homology to any sequence flanking the target site. While in some instances the regions of homology share significant sequence homology to the genomic sequence immediately flanking the target site, it is recognized that the regions of homology can be designed to have sufficient homology to regions that may be further 5′ or 3′ to the target site. The regions of homology can also have homology with a fragment of the target site along with downstream genomic regions.

In one embodiment, the first region of homology further comprises a first fragment of the target site and the second region of homology comprises a second fragment of the target site, wherein the first and second fragments are dissimilar.

Polynucleotides of Interest

Polynucleotides of interest are further described herein and include polynucleotides reflective of the commercial markets and interests of those involved in the development of the crop. Crops and markets of interest change, and as developing nations open up world markets, new crops and technologies will emerge also. In addition, as our understanding of agronomic traits and characteristics such as yield and heterosis increases, the choice of genes for genetic engineering will change accordingly.

General categories of polynucleotides of interest include, for example, genes of interest involved in information, such as zinc fingers, those involved in communication, such as kinases, and those involved in housekeeping, such as heat shock proteins. More specific polynucleotides of interest include, but are not limited to, genes involved in traits of agronomic interest such as but not limited to: crop yield, grain quality, crop nutrient content, starch and carbohydrate quality and quantity as well as those affecting kernel size, sucrose loading, protein quality and quantity, nitrogen fixation and/or utilization, fatty acid and oil composition, genes encoding proteins conferring resistance to abiotic stress (such as drought, nitrogen, temperature, salinity, toxic metals or trace elements, or those conferring resistance to toxins such as pesticides and herbicides), genes encoding proteins conferring resistance to biotic stress (such as attacks by fungi, viruses, bacteria, insects, and nematodes, and development of diseases associated with these organisms).

Agronomically important traits such as oil, starch, and protein content can be genetically altered in addition to using traditional breeding methods. Modifications include increasing content of oleic acid, saturated and unsaturated oils, increasing levels of lysine and sulfur, providing essential amino acids, and also modification of starch. Hordothionin protein modifications are described in U.S. Pat. Nos. 5,703,049, 5,885,801, 5,885,802, and 5,990,389.

Polynucleotide sequences of interest may encode proteins involved in providing disease or pest resistance. By “disease resistance” or “pest resistance” is intended that the plants avoid the harmful symptoms that are the outcome of the plant-pathogen interactions. Pest resistance genes may encode resistance to pests that have great yield drag such as rootworm, cutworm, European Corn Borer, and the like. Disease resistance and insect resistance genes such as lysozymes or cecropins for antibacterial protection, or proteins such as defensins, glucanases or chitinases for antifungal protection, or Bacillus thuringiensis endotoxins, protease inhibitors, collagenases, lectins, or glycosidases for controlling nematodes or insects are all examples of useful gene products. Genes encoding disease resistance traits include detoxification genes, such as against fumonisin (U.S. Pat. No. 5,792,931); avirulence (avr) and disease resistance (R) genes (Jones et al. (1994) Science 266:789; Martin et al. (1993) Science 262:1432; and Mindrinos et al. (1994) Cell 78:1089); and the like. Insect resistance genes may encode resistance to pests that have great yield drag such as rootworm, cutworm, European Corn Borer, and the like. Such genes include, for example, Bacillus thuringiensis toxic protein genes (U.S. Pat. Nos. 5,366,892; 5,747,450; 5,736,514; 5,723,756; 5,593,881; and Geiser et al. (1986) Gene 48:109); and the like.

An “herbicide resistance protein” or a protein resulting from expression of an “herbicide resistance-encoding nucleic acid molecule” includes proteins that confer upon a cell the ability to tolerate a higher concentration of an herbicide than cells that do not express the protein, or to tolerate a certain concentration of an herbicide for a longer period of time than cells that do not express the protein. Herbicide resistance traits may be introduced into plants by genes coding for resistance to herbicides that act to inhibit the action of acetolactate synthase (ALS, also referred to as acetohydroxyacid synthase, AHAS), in particular the sulfonylurea (UK: sulphonylurea) type herbicides, genes coding for resistance to herbicides that act to inhibit the action of glutamine synthase, such as phosphinothricin or basta (e.g., the bar gene), glyphosate (e.g., the EPSP synthase gene and the GAT gene), HPPD inhibitors (e.g., the HPPD gene) or other such genes known in the art. See, for example, U.S. Pat. Nos. 7,626,077, 5,310,667, 5,866,775, 6,225,114, 6,248,876, 7,169,970, 6,867,293, and 9,187,762. The bar gene encodes resistance to the herbicide basta, the nptII gene encodes resistance to the antibiotics kanamycin and geneticin, and the ALS-gene mutants encode resistance to the herbicide chlorsulfuron.

Furthermore, it is recognized that the polynucleotide of interest may also comprise antisense sequences complementary to at least a portion of the messenger RNA (mRNA) for a targeted gene sequence of interest. Antisense nucleotides are constructed to hybridize with the corresponding mRNA. Modifications of the antisense sequences may be made as long as the sequences hybridize to and interfere with expression of the corresponding mRNA. In this manner, antisense constructions having 70%, 80%, or 85% sequence identity to the corresponding antisense sequences may be used. Furthermore, portions of the antisense nucleotides may be used to disrupt the expression of the target gene. Generally, sequences of at least 50 nucleotides, 100 nucleotides, 200 nucleotides, or greater may be used.
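The complementarity between an antisense sequence and its corresponding mRNA can be illustrated with a minimal Python sketch that returns the reverse complement of an RNA sequence. The function name `antisense_rna` and the short example sequence are hypothetical, and the sketch produces only a fully complementary antisense sequence; it does not model the partially identical variants mentioned above.

```python
# Translation table for RNA base pairing (A-U, G-C), upper and lower case.
_RNA_COMPLEMENT = str.maketrans("ACGUacgu", "UGCAugca")


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement of an mRNA sequence, written 5' to 3'."""
    return mrna.translate(_RNA_COMPLEMENT)[::-1]


mrna_fragment = "AUGGCC"  # made-up fragment, written 5' to 3'
print(antisense_rna(mrna_fragment))
```

Applying the function twice returns the original sequence, which is a quick check that the base-pairing table is consistent.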

In addition, the polynucleotide of interest may also be used in the sense orientation to suppress the expression of endogenous genes in plants. Methods for suppressing gene expression in plants using polynucleotides in the sense orientation are known in the art. The methods generally involve transforming plants with a DNA construct comprising a promoter that drives expression in a plant operably linked to at least a portion of a nucleotide sequence that corresponds to the transcript of the endogenous gene. Typically, such a nucleotide sequence has substantial sequence identity to the sequence of the transcript of the endogenous gene, generally greater than about 65% sequence identity, about 85% sequence identity, or greater than about 95% sequence identity. See U.S. Pat. Nos. 5,283,184 and 5,034,323.

The polynucleotide of interest can also be a phenotypic marker. A phenotypic marker is a screenable or a selectable marker that includes visual markers and selectable markers, whether it is a positive or negative selectable marker. Any phenotypic marker can be used. Specifically, a selectable or screenable marker comprises a DNA segment that allows one to identify, or select for or against, a molecule or a cell that comprises it, often under particular conditions. These markers can encode an activity, such as, but not limited to, production of RNA, peptide, or protein, or can provide a binding site for RNA, peptides, proteins, inorganic and organic compounds or compositions and the like.

Examples of selectable markers include, but are not limited to, DNA segments that comprise restriction enzyme sites; DNA segments that encode products which provide resistance against otherwise toxic compounds (including antibiotics, such as spectinomycin, ampicillin, kanamycin, tetracycline, Basta, neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT)); DNA segments that encode products which are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); DNA segments that encode products which can be readily identified (e.g., phenotypic markers such as β-galactosidase, GUS; fluorescent proteins such as green fluorescent protein (GFP), cyan (CFP), yellow (YFP), red (RFP), and cell surface proteins); the generation of new primer sites for PCR (e.g., the juxtaposition of two DNA sequences not previously juxtaposed), the inclusion of DNA sequences not acted upon or acted upon by a restriction endonuclease or other DNA modifying enzyme, chemical, etc.; and, the inclusion of DNA sequences required for a specific modification (e.g., methylation) that allows its identification.

Additional selectable markers include genes that confer resistance to herbicidal compounds, such as sulphonylureas, glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). See for example, acetolactate synthase (ALS) for resistance to sulfonylureas, imidazolinones, triazolopyrimidine sulfonamides, pyrimidinylsalicylates and sulphonylaminocarbonyl-triazolinones (Shaner and Singh, 1997, Herbicide Activity: Toxicol Biochem Mol Biol 69-110); glyphosate resistant 5-enolpyruvylshikimate-3-phosphate (EPSPS) (Saroha et al. 1998, J Plant Biochemistry & Biotechnology Vol 7:65-72).

Polynucleotides of interest include genes that can be stacked or used in combination with other traits, such as but not limited to herbicide resistance or any other trait described herein. Polynucleotides of interest and/or traits can be stacked together in a complex trait locus as described in US20130263324 published 3 Oct. 2013 and in WO/2013/112686, published 1 Aug. 2013.

A polypeptide of interest includes any protein or polypeptide that is encoded by a polynucleotide of interest described herein.

Further provided are methods for identifying at least one plant cell, comprising in its genome, a polynucleotide of interest integrated at the target site. A variety of methods are available for identifying those plant cells with insertion into the genome at or near to the target site. Such methods can be viewed as directly analyzing a target sequence to detect any change in the target sequence, including but not limited to PCR methods, sequencing methods, nuclease digestion, Southern blots, and any combination thereof. See, for example, US20090133152 published 21 May 2009. The method also comprises recovering a plant from the plant cell comprising a polynucleotide of interest integrated into its genome. The plant may be sterile or fertile. It is recognized that any polynucleotide of interest can be provided, integrated into the plant genome at the target site, and expressed in a plant.

Optimization of Sequences for Expression in Plants

Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Pat. Nos. 5,380,831 and 5,436,391, and Murray et al. (1989) Nucleic Acids Res. 17:477-498. Additional sequence modifications are known to enhance gene expression in a plant host. These include, for example, elimination of: one or more sequences encoding spurious polyadenylation signals, one or more exon-intron splice site signals, one or more transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given plant host, as calculated by reference to known genes expressed in the host plant cell. When possible, the sequence is modified to avoid one or more predicted hairpin secondary mRNA structures. Thus, “a plant-optimized nucleotide sequence” of the present disclosure comprises one or more of such sequence modifications.
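As an illustration only, two of the checks named above (G-C content and spurious polyadenylation signals) can be sketched in a few lines of Python. The motif list, the function names and the example sequence are assumptions made for this sketch and are not taken from the disclosure; a real optimization workflow would use host-specific reference values and a fuller set of motifs.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Assumed motif list: the canonical AATAAA signal and one common variant.
POLYA_MOTIFS = ("AATAAA", "ATTAAA")


def find_motifs(seq: str, motifs=POLYA_MOTIFS):
    """Return sorted (position, motif) pairs for every occurrence (0-based)."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            # Step by one base so overlapping occurrences are also reported.
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical candidate coding sequence, used only to exercise the functions.
candidate = "ATGGCCAATAAAGGCTGA"
print(round(gc_content(candidate), 3))  # 0.444
print(find_motifs(candidate))           # [(6, 'AATAAA')]
```

A sequence flagged by such a scan could then be recoded with synonymous codons to remove the motif or shift the G-C content toward the host average; that recoding step is not shown here.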

Expression Elements

Any polynucleotide encoding a Cas protein, other CRISPR system component, or other polynucleotide disclosed herein may be functionally linked to a heterologous expression element, to facilitate transcription or regulation in a host cell. Such expression elements include but are not limited to: promoter, leader, intron, and terminator. Expression elements may be “minimal”, meaning a shorter sequence derived from a native source that still functions as an expression regulator or modifier. Alternatively, an expression element may be “optimized”, meaning that its polynucleotide sequence has been altered from its native state in order to function with a more desirable characteristic in a particular host cell (for example, but not limited to, a bacterial promoter may be “maize-optimized” to improve its expression in corn plants). Alternatively, an expression element may be “synthetic”, meaning that it is designed in silico and synthesized for use in a host cell. Synthetic expression elements may be entirely synthetic, or partially synthetic (comprising a fragment of a naturally-occurring polynucleotide sequence).

It has been shown that certain promoters are able to direct RNA synthesis at a higher rate than others. These are called “strong promoters”. Certain other promoters have been shown to direct RNA synthesis at higher levels only in particular types of cells or tissues and are often referred to as “tissue specific promoters”, or “tissue-preferred promoters” if the promoters direct RNA synthesis preferably in certain tissues but also in other tissues at reduced levels.

A plant promoter includes a promoter capable of initiating transcription in a plant cell. For a review of plant promoters, see Potenza et al., 2004, In vitro Cell Dev Biol 40:1-22; Porto et al., 2014, Molecular Biotechnology 56(1):38-49.

Constitutive promoters include, for example, the core CaMV 35S promoter (Odell et al., (1985) Nature 313:810-2); rice actin (McElroy et al., (1990) Plant Cell 2:163-71); ubiquitin (Christensen et al., (1989) Plant Mol Biol 12:619-32); ALS promoter (U.S. Pat. No. 5,659,026) and the like.

Tissue-preferred promoters can be utilized to target enhanced expression within a particular plant tissue. Tissue-preferred promoters include, for example, WO2013103367 published 11 Jul. 2013, Kawamata et al., (1997) Plant Cell Physiol 38:792-803; Hansen et al., (1997) Mol Gen Genet 254:337-43; Russell et al., (1997) Transgenic Res 6:157-68; Rinehart et al., (1996) Plant Physiol 112:1331-41; Van Camp et al., (1996) Plant Physiol 112:525-35; Canevascini et al., (1996) Plant Physiol 112:513-524; Lam, (1994) Results Probl Cell Differ 20:181-96; and Guevara-Garcia et al., (1993) Plant J 4:495-505. Leaf-preferred promoters include, for example, Yamamoto et al., (1997) Plant J 12:255-65; Kwon et al., (1994) Plant Physiol 105:357-67; Yamamoto et al., (1994) Plant Cell Physiol 35:773-8; Gotor et al., (1993) Plant J 3:509-18; Orozco et al., (1993) Plant Mol Biol 23:1129-38; Matsuoka et al., (1993) Proc. Natl. Acad. Sci. USA 90:9586-90; Simpson et al., (1985) EMBO J 4:2723-9; Timko et al., (1988) Nature 318:57-8. Root-preferred promoters include, for example, Hire et al., (1992) Plant Mol Biol 20:207-18 (soybean root-specific glutamine synthase gene); Miao et al., (1991) Plant Cell 3:11-22 (cytosolic glutamine synthase (GS)); Keller and Baumgartner, (1991) Plant Cell 3:1051-61 (root-specific control element in the GRP 1.8 gene of French bean); Sanger et al., (1990) Plant Mol Biol 14:433-43 (root-specific promoter of A. tumefaciens mannopine synthase (MAS)); Bogusz et al., (1990) Plant Cell 2:633-41 (root-specific promoters isolated from Parasponia andersonii and Trema tomentosa); Leach and Aoyagi, (1991) Plant Sci 79:69-76 (A. rhizogenes rolC and rolD root-inducing genes); Teeri et al., (1989) EMBO J 8:343-50 (Agrobacterium wound-induced TR1′ and TR2′ genes); VfENOD-GRP3 gene promoter (Kuster et al., (1995) Plant Mol Biol 29:759-72); rolB promoter (Capana et al., (1994) Plant Mol Biol 25:681-91); and phaseolin gene (Murai et al., (1983) Science 23:476-82; Sengopta-Gopalen et al., (1988) Proc. Natl. Acad. Sci. USA 82:3320-4). See also, U.S. Pat. Nos. 5,837,876; 5,750,386; 5,633,363; 5,459,252; 5,401,836; 5,110,732 and 5,023,179.

Seed-preferred promoters include both seed-specific promoters active during seed development, as well as seed-germinating promoters active during seed germination. See, Thompson et al., (1989) BioEssays 10:108. Seed-preferred promoters include, but are not limited to, Cim1 (cytokinin-induced message); cZ19B1 (maize 19 kDa zein); and milps (myo-inositol-1-phosphate synthase); and for example those disclosed in WO2000011177 published 2 Mar. 2000 and U.S. Pat. No. 6,225,529. For dicots, seed-preferred promoters include, but are not limited to, bean β-phaseolin, napin, β-conglycinin, soybean lectin, cruciferin, and the like. For monocots, seed-preferred promoters include, but are not limited to, maize 15 kDa zein, 22 kDa zein, 27 kDa gamma zein, waxy, shrunken 1, shrunken 2, globulin 1, oleosin, and nuc1. See also, WO2000012733 published 9 Mar. 2000, where seed-preferred promoters from END1 and END2 genes are disclosed.

Chemical inducible (regulated) promoters can be used to modulate the expression of a gene in a prokaryotic or eukaryotic cell or organism through the application of an exogenous chemical regulator. The promoter may be a chemical-inducible promoter, where application of the chemical induces gene expression, or a chemical-repressible promoter, where application of the chemical represses gene expression. Chemical-inducible promoters include, but are not limited to, the maize In2-2 promoter, activated by benzene sulfonamide herbicide safeners (De Veylder et al., (1997) Plant Cell Physiol 38:568-77), the maize GST promoter (GST-II-27, WO1993001294 published 21 Jan. 1993), activated by hydrophobic electrophilic compounds used as pre-emergent herbicides, and the tobacco PR-1a promoter (Ono et al., (2004) Biosci Biotechnol Biochem 68:803-7), activated by salicylic acid. Other chemical-regulated promoters include steroid-responsive promoters (see, for example, the glucocorticoid-inducible promoter (Schena et al., (1991) Proc. Natl. Acad. Sci. USA 88:10421-5; McNellis et al., (1998) Plant J 14:247-257)); and tetracycline-inducible and tetracycline-repressible promoters (Gatz et al., (1991) Mol Gen Genet 227:229-37; U.S. Pat. Nos. 5,814,618 and 5,789,156).

Pathogen inducible promoters induced following infection by a pathogen include, but are not limited to, those regulating expression of PR proteins, SAR proteins, beta-1,3-glucanase, chitinase, etc.

A stress-inducible promoter includes the RD29A promoter (Kasuga et al. (1999) Nature Biotechnol. 17:287-91). One of ordinary skill in the art is familiar with protocols for simulating stress conditions such as drought, osmotic stress, salt stress and temperature stress and for evaluating stress tolerance of plants that have been subjected to simulated or naturally-occurring stress conditions.

Another example of an inducible promoter useful in plant cells is the ZmCAS1 promoter, described in US20130312137 published 21 Nov. 2013.

New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) In The Biochemistry of Plants, Vol. 15, Stumpf and Conn, eds (New York, N.Y.: Academic Press), pp. 1-82.

Introduction of System Components into a Cell

The methods and compositions described herein do not depend on a particular method for introducing a sequence into an organism or cell, only that the polynucleotide or polypeptide gains access to the interior of at least one cell of the organism. Introducing includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell, and includes reference to the transient (direct) provision of a nucleic acid, protein or polynucleotide-protein complex (PGEN, RGEN) to the cell.

Methods for introducing polynucleotides or polypeptides or a polynucleotide-protein complex into cells or organisms are known in the art including, but not limited to, microinjection, electroporation, stable transformation methods, transient transformation methods, ballistic particle acceleration (particle bombardment), whiskers mediated transformation, Agrobacterium-mediated transformation, direct gene transfer, viral-mediated introduction, transfection, transduction, cell-penetrating peptides, mesoporous silica nanoparticle (MSN)-mediated direct protein delivery, topical applications, sexual crossing, sexual breeding, and any combination thereof.

For example, the guide polynucleotide (guide RNA, crNucleotide+tracrNucleotide, guide DNA and/or guide RNA-DNA molecule) can be introduced into a cell directly (transiently) as a single stranded or double stranded polynucleotide molecule. The guide RNA (or crRNA+tracrRNA) can also be introduced into a cell indirectly by introducing a recombinant DNA molecule comprising a heterologous nucleic acid fragment encoding the guide RNA (or crRNA+tracrRNA), operably linked to a specific promoter that is capable of transcribing the guide RNA (crRNA+tracrRNA molecules) in said cell. The specific promoter can be, but is not limited to, an RNA polymerase III promoter, which allows for transcription of RNA with precisely defined, unmodified, 5′- and 3′-ends (Ma et al., 2014, Mol. Ther. Nucleic Acids 3:e161; DiCarlo et al., 2013, Nucleic Acids Res. 41: 4336-4343; WO2015026887, published 26 Feb. 2015). Any promoter capable of transcribing the guide RNA in a cell can be used and includes a heat shock/heat inducible promoter operably linked to a nucleotide sequence encoding the guide RNA.

The Cas endonuclease, such as the Cas endonuclease described herein, can be introduced into a cell by directly introducing the Cas polypeptide itself (referred to as direct delivery of Cas endonuclease), the mRNA encoding the Cas protein, and/or the guide polynucleotide/Cas endonuclease complex itself, using any method known in the art. The Cas endonuclease can also be introduced into a cell indirectly by introducing a recombinant DNA molecule that encodes the Cas endonuclease. The endonuclease can be introduced into a cell transiently or can be incorporated into the genome of the host cell using any method known in the art. Uptake of the endonuclease and/or the guide polynucleotide into the cell can be facilitated with a Cell Penetrating Peptide (CPP) as described in WO2016073433 published 12 May 2016. Any promoter capable of expressing the Cas endonuclease in a cell can be used and includes a heat shock/heat inducible promoter operably linked to a nucleotide sequence encoding the Cas endonuclease.

Direct delivery of a polynucleotide modification template into plant cells can be achieved through particle mediated delivery. Any other direct method of delivery, such as but not limited to polyethylene glycol (PEG)-mediated transfection to protoplasts, whiskers mediated transformation, electroporation, particle bombardment, cell-penetrating peptides, or mesoporous silica nanoparticle (MSN)-mediated direct protein delivery, can also be used for delivering a polynucleotide modification template in eukaryotic cells, such as plant cells.

The donor DNA can be introduced by any means known in the art. The donor DNA may be provided by any transformation method known in the art including, for example, Agrobacterium-mediated transformation or biolistic particle bombardment. The donor DNA may be present transiently in the cell or it could be introduced via a viral replicon. In the presence of the Cas endonuclease and the target site, the donor DNA is inserted into the transformed plant's genome.

Direct delivery of any one of the guided Cas system components can be accompanied by direct delivery (co-delivery) of other mRNAs that can promote the enrichment and/or visualization of cells receiving the guide polynucleotide/Cas endonuclease complex components. For example, direct co-delivery of the guide polynucleotide/Cas endonuclease components (and/or guide polynucleotide/Cas endonuclease complex itself) together with mRNA encoding phenotypic markers (such as but not limited to transcriptional activators such as CRC (Bruce et al. 2000 The Plant Cell 12:65-79)) can enable the selection and enrichment of cells without the use of an exogenous selectable marker by restoring function to a non-functional gene product as described in WO2017070032 published 27 Apr. 2017.

Introducing a guide RNA/Cas endonuclease complex described herein (representing the cleavage ready cascade described herein) into a cell includes introducing the individual components of said complex either separately or combined into the cell, and either directly (direct delivery as RNA for the guide and protein for the Cas endonuclease and protein subunits, or functional fragments thereof) or via recombination constructs expressing the components (guide RNA, Cas endonuclease, protein subunits, or functional fragments thereof). Introducing a guide RNA/Cas endonuclease complex (RGEN) into a cell includes introducing the guide RNA/Cas endonuclease complex as a ribonucleotide-protein into the cell. The ribonucleotide-protein can be assembled prior to being introduced into the cell as described herein. The components comprising the guide RNA/Cas endonuclease ribonucleotide protein (at least one Cas endonuclease, at least one guide RNA, at least one protein subunit) can be assembled in vitro or assembled by any means known in the art prior to being introduced into a cell (targeted for genome modification as described herein).

Plant cells differ from human and animal cells in that plant cells comprise a plant cell wall which may act as a barrier to the direct delivery of the ribonucleoproteins and/or of the components.

Direct delivery of a ribonucleoprotein comprising a Cas endonuclease protein and a guide RNA into plant cells may be achieved through particle mediated delivery (particle bombardment). Based on the experiments described herein, a skilled artisan can now envision that any other direct method of delivery, such as but not limited to polyethylene glycol (PEG)-mediated transfection to protoplasts, electroporation, cell-penetrating peptides, or mesoporous silica nanoparticle (MSN)-mediated direct protein delivery, can be successfully used for delivering RGEN ribonucleoproteins into plant cells.

Direct delivery of the ribonucleoprotein allows for genome editing at a target site in the genome of a cell which can be followed by rapid degradation of the complex, and only a transient presence of the complex in the cell. This transient presence of the complex may lead to reduced off-target effects. In contrast, delivery of components (guide RNA, Cas9 endonuclease) via plasmid DNA sequences can result in constant expression from these plasmids, which in some cases may promote off-target cleavage (Cradick, T. J. et al. (2013) Nucleic Acids Res 41:9584-9592; Fu, Y et al. (2013) Nat. Biotechnol. 31:822-826).

Direct delivery can be achieved by combining any one component of the guide RNA/Cas endonuclease complex, representing the cleavage ready cascade described herein (such as at least one guide RNA, at least one Cas protein, and optionally one additional protein), with a particle delivery matrix comprising a microparticle (such as but not limited to a gold particle, tungsten particle, or silicon carbide whisker particle) (see also WO2017070032 published 27 Apr. 2017).

In one aspect, the guide polynucleotide/Cas endonuclease complex is a complex wherein the guide RNA and Cas endonuclease protein forming the guide RNA/Cas endonuclease complex are introduced into the cell as RNA and protein, respectively.

In one aspect, the guide polynucleotide/Cas endonuclease complex is a complex wherein the guide RNA and Cas endonuclease protein and the at least one protein subunit of a Cascade forming the guide RNA/Cas endonuclease complex are introduced into the cell as RNA and proteins, respectively.

In one aspect, the guide polynucleotide/Cas endonuclease complex is a complex wherein the guide RNA and Cas endonuclease protein and the at least one protein subunit of a Cascade forming the guide RNA/Cas endonuclease complex (cleavage ready cascade) are preassembled in vitro and introduced into the cell as a ribonucleotide-protein complex.

Protocols for introducing polynucleotides, polypeptides or polynucleotide-protein complexes (PGEN, RGEN) into eukaryotic cells, such as plants or plant cells, are known and include microinjection (Crossway et al., (1986) Biotechniques 4:320-34 and U.S. Pat. No. 6,300,543), meristem transformation (U.S. Pat. No. 5,736,369), electroporation (Riggs et al., (1986) Proc. Natl. Acad. Sci. USA 83:5602-6), Agrobacterium-mediated transformation (U.S. Pat. Nos. 5,563,055 and 5,981,840), whiskers mediated transformation (Ainley et al. 2013, Plant Biotechnology Journal 11:1126-1134; Shaheen A. and M. Arshad 2011 Properties and Applications of Silicon Carbide (2011), 345-358 Editor(s): Gerhardt, Rosario. Publisher: InTech, Rijeka, Croatia. CODEN: 69PQBP; ISBN: 978-953-307-201-2), direct gene transfer (Paszkowski et al., (1984) EMBO J 3:2717-22), and ballistic particle acceleration (U.S. Pat. Nos. 4,945,050; 5,879,918; 5,886,244; 5,932,782; Tomes et al., (1995) “Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment” in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg & Phillips (Springer-Verlag, Berlin); McCabe et al., (1988) Biotechnology 6:923-6; Weissinger et al., (1988) Ann Rev Genet 22:421-77; Sanford et al., (1987) Particulate Science and Technology 5:27-37 (onion); Christou et al., (1988) Plant Physiol 87:671-4 (soybean); Finer and McMullen, (1991) In vitro Cell Dev Biol 27P:175-82 (soybean); Singh et al., (1998) Theor Appl Genet 96:319-24 (soybean); Datta et al., (1990) Biotechnology 8:736-40 (rice); Klein et al., (1988) Proc. Natl. Acad. Sci. USA 85:4305-9 (maize); Klein et al., (1988) Biotechnology 6:559-63 (maize); U.S. Pat. Nos. 5,240,855; 5,322,783 and 5,324,646; Klein et al., (1988) Plant Physiol 91:440-4 (maize); Fromm et al., (1990) Biotechnology 8:833-9 (maize); Hooykaas-Van Slogteren et al., (1984) Nature 311:763-4; U.S. Pat. No. 5,736,369 (cereals); Bytebier et al., (1987) Proc. Natl. Acad. Sci. USA 84:5345-9 (Liliaceae); De Wet et al., (1985) in The Experimental Manipulation of Ovule Tissues, ed. Chapman et al., (Longman, New York), pp. 197-209 (pollen); Kaeppler et al., (1990) Plant Cell Rep 9:415-8 and Kaeppler et al., (1992) Theor Appl Genet 84:560-6 (whisker-mediated transformation); D'Halluin et al., (1992) Plant Cell 4:1495-505 (electroporation); Li et al., (1993) Plant Cell Rep 12:250-5; Christou and Ford (1995) Annals Botany 75:407-13 (rice) and Ishida et al., (1996) Nat Biotechnol 14:745-50 (maize via Agrobacterium tumefaciens)).

Alternatively, polynucleotides may be introduced into cells by contacting cells or organisms with a virus or viral nucleic acids. Generally, such methods involve incorporating a polynucleotide within a viral DNA or RNA molecule. In some examples a polypeptide of interest may be initially synthesized as part of a viral polyprotein, which is later processed by proteolysis in vivo or in vitro to produce the desired recombinant protein. Methods for introducing polynucleotides into plants and expressing a protein encoded therein, involving viral DNA or RNA molecules, are known; see, for example, U.S. Pat. Nos. 5,889,191, 5,889,190, 5,866,785, 5,589,367 and 5,316,931.

The polynucleotide or recombinant DNA construct can be provided to or introduced into a prokaryotic or eukaryotic cell or organism using a variety of transient transformation methods. Such transient transformation methods include, but are not limited to, the introduction of the polynucleotide construct directly into the plant.

Nucleic acids and proteins can be provided to a cell by any method, including methods using molecules to facilitate the uptake of any one or all components of a guided Cas system (protein and/or nucleic acids), such as cell-penetrating peptides and nanocarriers. See also US20110035836 published 10 Feb. 2011, and EP2821486A1 published 7 Jan. 2015.

Other methods of introducing polynucleotides into a prokaryotic or eukaryotic cell or organism or plant part can be used, including plastid transformation methods, and the methods for introducing polynucleotides into tissues from seedlings or mature seeds.

Stable transformation is intended to mean that the nucleotide construct introduced into an organism integrates into a genome of the organism and is capable of being inherited by the progeny thereof. Transient transformation is intended to mean that a polynucleotide is introduced into the organism and does not integrate into a genome of the organism, or a polypeptide is introduced into an organism. Transient transformation indicates that the introduced composition is only temporarily expressed or present in the organism.

A variety of methods are available to identify those cells having an altered genome at or near a target site without using a screenable marker phenotype. Such methods can be viewed as directly analyzing a target sequence to detect any change in the target sequence, including but not limited to PCR methods, sequencing methods, nuclease digestion, Southern blots, and any combination thereof.
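For the sequencing-based case, the idea of directly analyzing a target sequence for any change can be illustrated with a minimal Python sketch that compares a sequenced read against the unmodified reference. This is an assumption-laden toy and not a method from the disclosure: the function names and sequences are hypothetical, the comparison uses the generic difflib matcher from the Python standard library rather than a dedicated sequence aligner, and real reads would first need quality filtering and alignment.

```python
from difflib import SequenceMatcher


def describe_changes(reference: str, read: str):
    """List regions where a read differs from the reference target sequence.

    Each entry is (kind, ref_start, ref_end, read_bases), where kind is
    'replace', 'delete' or 'insert' as reported by difflib.
    """
    ref = reference.upper()
    obs = read.upper()
    matcher = SequenceMatcher(None, ref, obs, autojunk=False)
    changes = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            changes.append((kind, i1, i2, obs[j1:j2]))
    return changes


def is_altered(reference: str, read: str) -> bool:
    """True if the read shows any change relative to the reference."""
    return bool(describe_changes(reference, read))


# Hypothetical target-site sequence and a read carrying a 3-base deletion.
reference = "ACGTTGCAGGTCCATTGA"
deleted_read = reference[:5] + reference[8:]

print(is_altered(reference, reference))
print(describe_changes(reference, deleted_read))
```

Reads flagged as altered would still need to be classified (for example, as a templated edit versus an untemplated insertion or deletion) by comparison with the expected edited sequence.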

Cells and Organisms

The presently disclosed polynucleotides and polypeptides can be introduced into a cell. Cells include, but are not limited to, human, non-human, animal, mammalian, bacterial, protist, fungal, insect, yeast, non-conventional yeast, and plant cells, as well as plants and seeds produced by the methods described herein. In some aspects, the cell of the organism is a reproductive cell, a somatic cell, a meiotic cell, a mitotic cell, a stem cell, or a pluripotent stem cell. Any cell from any organism may be used with the compositions and methods described herein, including monocot and dicot plants, and plant elements.

Animal Cells

The presently disclosed polynucleotides and polypeptides can be introduced into an animal cell. Animal cells can include, but are not limited to, cells of: an organism of a phylum including chordates, arthropods, mollusks, annelids, cnidarians, or echinoderms; or an organism of a class including mammals, insects, birds, amphibians, reptiles, or fishes. In some aspects, the animal is human, mouse, C. elegans, rat, fruit fly (Drosophila spp.), zebrafish, chicken, dog, cat, guinea pig, hamster, Japanese ricefish, sea lamprey, pufferfish, tree frog (e.g., Xenopus spp.), monkey, or chimpanzee. Particular cell types that are contemplated include haploid cells, diploid cells, reproductive cells, neurons, muscle cells, endocrine or exocrine cells, epithelial cells, tumor cells, embryonic cells, hematopoietic cells, bone cells, germ cells, somatic cells, stem cells, pluripotent stem cells, induced pluripotent stem cells, progenitor cells, meiotic cells, and mitotic cells. In some aspects, a plurality of cells from an organism may be used.

The compositions and methods described herein may be used to edit the genome of an animal cell in various ways. In one aspect, it may be desirable to delete one or more nucleotides. In another aspect, it may be desirable to insert one or more nucleotides. In one aspect, it may be desirable to replace one or more nucleotides. In another aspect, it may be desirable to modify one or more nucleotides via a covalent or non-covalent interaction with another atom or molecule.

Genome modification may be used to effect a genotypic and/or phenotypic change on the target organism. Such a change is preferably related to an improved phenotype of interest or a physiologically-important characteristic, the correction of an endogenous defect, or the expression of some type of expression marker. In some aspects, the phenotype of interest or physiologically-important characteristic is related to the overall health, fitness, or fertility of the animal, the ecological fitness of the animal, or the relationship or interaction of the animal with other organisms in its environment. In some aspects, the phenotype of interest or physiologically-important characteristic is selected from the group consisting of: improved general health, disease reversal, disease modification, disease stabilization, disease prevention, treatment of parasitic infections, treatment of viral infections, treatment of retroviral infections, treatment of bacterial infections, treatment of neurological disorders (for example but not limited to: multiple sclerosis), correction of endogenous genetic defects (for example but not limited to: metabolic disorders, Achondroplasia, Alpha-1 Antitrypsin Deficiency, Antiphospholipid Syndrome, Autism, Autosomal Dominant Polycystic Kidney Disease, Barth syndrome, Breast cancer, Charcot-Marie-Tooth, Colon cancer, Cri du chat, Crohn's Disease, Cystic fibrosis, Dercum Disease, Down Syndrome, Duane Syndrome, Duchenne Muscular Dystrophy, Factor V Leiden Thrombophilia, Familial Hypercholesterolemia, Familial Mediterranean Fever, Fragile X Syndrome, Gaucher Disease, Hemochromatosis, Hemophilia, Holoprosencephaly, Huntington's disease, Klinefelter syndrome, Marfan syndrome, Myotonic Dystrophy, Neurofibromatosis, Noonan Syndrome, Osteogenesis Imperfecta, Parkinson's disease, Phenylketonuria, Poland Anomaly, Porphyria, Progeria, Prostate Cancer, Retinitis Pigmentosa, Severe Combined Immunodeficiency (SCID), Sickle cell disease, Skin Cancer, Spinal Muscular Atrophy, Tay-Sachs, Thalassemia, Trimethylaminuria, Turner Syndrome, Velocardiofacial Syndrome, WAGR Syndrome, and Wilson Disease), treatment of innate immune disorders (for example but not limited to: immunoglobulin subclass deficiencies), treatment of acquired immune disorders (for example but not limited to: AIDS and other HIV-related disorders), treatment of cancer, as well as treatment of diseases, including rare or “orphan” conditions, that have eluded effective treatment options with other methods.

Cells that have been genetically modified using the compositions or methods described herein may be transplanted to a subject for purposes such as gene therapy, e.g. to treat a disease, or as an antiviral, antipathogenic, or anticancer therapeutic, for the production of genetically modified organisms in agriculture, or for biological research.

Plant Cells and Plants

Examples of monocot plants that can be used include, but are not limited to, corn (Zea mays), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger millet (Eleusine coracana)), wheat (Triticum species, for example Triticum aestivum, Triticum monococcum), sugarcane (Saccharum spp.), oats (Avena), barley (Hordeum), switchgrass (Panicum virgatum), pineapple (Ananas comosus), banana (Musa spp.), palm, ornamentals, turfgrasses, and other grasses.

Examples of dicot plants that can be used include, but are not limited to, soybean (Glycine max), Brassica species (for example but not limited to: oilseed rape or Canola) (Brassica napus, B. campestris, Brassica rapa, Brassica juncea), alfalfa (Medicago sativa), tobacco (Nicotiana tabacum), Arabidopsis (Arabidopsis thaliana), sunflower (Helianthus annuus), cotton (Gossypium arboreum, Gossypium barbadense), peanut (Arachis hypogaea), tomato (Solanum lycopersicum), and potato (Solanum tuberosum).

Additional plants that can be used include safflower (Carthamus tinctorius), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos nucifera), citrus trees (Citrus spp.), cocoa (Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig (Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), vegetables, ornamentals, and conifers.

Vegetables that can be used include tomatoes (Lycopersicon esculentum), lettuce (e.g., Lactuca sativa), green beans (Phaseolus vulgaris), lima beans (Phaseolus limensis), peas (Lathyrus spp.), and members of the genus Cucumis such as cucumber (C. sativus), cantaloupe (C. cantalupensis), and musk melon (C. melo). Ornamentals include azalea (Rhododendron spp.), hydrangea (Hydrangea macrophylla), hibiscus (Hibiscus rosa-sinensis), roses (Rosa spp.), tulips (Tulipa spp.), daffodils (Narcissus spp.), petunias (Petunia hybrida), carnation (Dianthus caryophyllus), poinsettia (Euphorbia pulcherrima), and chrysanthemum.

Conifers that may be used include pines such as loblolly pine (Pinus taeda), slash pine (Pinus elliottii), ponderosa pine (Pinus ponderosa), lodgepole pine (Pinus contorta), and Monterey pine (Pinus radiata); Douglas fir (Pseudotsuga menziesii); Western hemlock (Tsuga canadensis); Sitka spruce (Picea glauca); redwood (Sequoia sempervirens); true firs such as silver fir (Abies amabilis) and balsam fir (Abies balsamea); and cedars such as Western red cedar (Thuja plicata) and Alaska yellow cedar (Chamaecyparis nootkatensis).

In certain embodiments of the disclosure, a fertile plant is a plant that produces viable male and female gametes and is self-fertile. Such a self-fertile plant can produce a progeny plant without the contribution from any other plant of a gamete and the genetic material comprised therein. Other embodiments of the disclosure can involve the use of a plant that is not self-fertile because the plant does not produce male gametes, or female gametes, or both, that are viable or otherwise capable of fertilization.

The present disclosure finds use in the breeding of plants comprising one or more introduced traits, or edited genomes.

A non-limiting example of how two traits can be stacked into the genome at a genetic distance of, for example, 5 cM from each other is described as follows: A first plant comprising a first transgenic target site integrated into a first DSB target site within the genomic window and not having the first genomic locus of interest is crossed to a second transgenic plant, comprising a genomic locus of interest at a different genomic insertion site within the genomic window, and the second plant does not comprise the first transgenic target site. About 5% of the plant progeny from this cross will have both the first transgenic target site integrated into a first DSB target site and the first genomic locus of interest integrated at different genomic insertion sites within the genomic window. Progeny plants having both sites in the defined genomic window can be further crossed with a third transgenic plant comprising a second transgenic target site integrated into a second DSB target site and/or a second genomic locus of interest within the defined genomic window and lacking the first transgenic target site and the first genomic locus of interest. Progeny are then selected having the first transgenic target site, the first genomic locus of interest and the second genomic locus of interest integrated at different genomic insertion sites within the genomic window. Such methods can be used to produce a transgenic plant comprising a complex trait locus having at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 or more transgenic target sites integrated into DSB target sites and/or genomic loci of interest integrated at different sites within the genomic window. In such a manner, various complex trait loci can be generated.
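The “about 5%” figure in the example above is consistent with the usual reading of genetic distance, in which 1 cM corresponds to roughly 1% recombination between two loci over short distances. The Python sketch below works through that arithmetic. The function names are hypothetical, and the Haldane mapping function is included only as a standard textbook comparison that assumes no crossover interference; it is not part of the disclosure.

```python
import math


def recombination_fraction_linear(distance_cm: float) -> float:
    """Short-distance approximation: 1 cM is about 1% recombination."""
    return distance_cm / 100.0


def recombination_fraction_haldane(distance_cm: float) -> float:
    """Haldane mapping function (assumes no crossover interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


distance = 5.0  # genetic distance between the two sites, in cM
print(recombination_fraction_linear(distance))             # 0.05
print(round(recombination_fraction_haldane(distance), 4))  # 0.0476
```

At 5 cM the two estimates differ by less than a quarter of a percentage point, so the linear approximation is adequate for the stacking example; the two diverge as the distance between the sites grows.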

Aspects of the Invention

In some aspects, the invention comprises:

Aspect 1: A method of increasing the frequency of homology-directed repair at a double-strand break of a target polynucleotide sequence, comprising: (a) introducing a site-specific first double-strand break to the target polynucleotide sequence, that results in a modified target sequence; (b) providing to the target polynucleotide sequence a Cas endonuclease and a guide RNA that binds to the modified sequence, thereby creating a second double-strand break; and (c) allowing the second double-strand break to undergo repair.

Aspect 2: The method of Aspect 1, wherein steps (b) and (c) are repeated for 1, 2, 3, 4, 5, or more subsequent cleavage(s) and repair(s) step(s).

Aspect 3: The method of Aspect 1, wherein a plurality of guide RNA molecules are provided.

Aspect 4: The method of Aspect 3, wherein the plurality of guide RNA molecules are provided simultaneously.

Aspect 5: The method of Aspect 3, wherein the plurality of guide RNA molecules are provided sequentially.

Aspect 6: A method of increasing the frequency of homology-directed repair at a double-strand break of a target polynucleotide sequence, comprising: (a) introducing a first double-strand break to the first target polynucleotide sequence, that results in a second target polynucleotide sequence; (b) introducing to the second target polynucleotide sequence a Cas endonuclease and a recombinant DNA construct comprising a guide RNA DNA sequence and a spacer DNA sequence, wherein the guide RNA DNA sequence is complementary to the second target polynucleotide sequence created by the repair of the first double-strand break, creating a second double-strand break; (c) allowing the second double-strand break to undergo repair, resulting in a third target polynucleotide sequence; (d) editing the DNA sequence of the spacer of the recombinant DNA construct so that it is substantially identical to the third target polynucleotide sequence; (e) expressing from the recombinant DNA construct a third guide polynucleotide that is substantially complementary to the third target polynucleotide sequence; and (f) allowing the third guide polynucleotide and the Cas endonuclease to introduce a third double-strand break at the target polynucleotide sequence.

Aspect 7: The method of Aspect 6, wherein steps (d) through (f) are repeated for 1, 2, 3, 4, 5, or more subsequent cleavage(s) and repair(s), wherein subsequent guide RNA(s) are designed in response to the previous cleavage repair mutation(s).

Aspect 8: A method of increasing the frequency of homology-directed repair at a double-strand break of a target polynucleotide sequence, comprising: (a) introducing a first double-strand break to the first target polynucleotide sequence; (b) introducing a single-strand nick to a polynucleotide adjacent to the first double-strand break; and (c) allowing the first double-strand break to undergo repair.

Aspect 9: The method of Aspect 8, wherein two nicks are created adjacent to, and flanking, the double-strand break of (a).

Aspect 10: The method of Aspect 8, wherein the nick in step (b) is created within 100 basepairs of the double-strand break of (a).

Aspect 11: The method of Aspect 1, 6, or 8, wherein at least one Cas endonuclease is provided as a protein molecule.

Aspect 12: The method of Aspect 1, 6, or 8, wherein at least one guide RNA is provided as an mRNA molecule.

Aspect 13: The method of Aspect 1, 6, or 8, further comprising introducing a heterologous polynucleotide after step (a).

Aspect 14: The method of Aspect 13, wherein the heterologous polynucleotide is a template for repair of the double-strand break.

Aspect 15: The method of Aspect 13, wherein the heterologous polynucleotide is a donor DNA molecule for insertion at the double-strand break site.

Aspect 16: The method of Aspect 1, 6, or 8, wherein the first double-strand break is repaired by NHEJ.

Aspect 17: The method of Aspect 1, 6, or 8, wherein the frequency of HDR of subsequent double-strand-break repairs is greater than the rate of HDR of the first double-strand-break repair.

Aspect 18: The method of Aspect 1, 6, or 8, wherein the frequency of HDR of a double-strand-break repair at a target polynucleotide site is greater than the rate of NHEJ at that same site.

Aspect 19: The method of Aspect 1 or 6, wherein the ratio of HDR to NHEJ increases in at least one subsequent DSB repair step as compared to the first DSB repair step with no nicks introduced.

Aspect 20: The method of Aspect 1, 6, or 8, wherein the cumulative frequency of HDR is greater than that observed for single cleavage.

Aspect 21: The method of Aspect 1, 6, or 8, wherein the cumulative percentage of HDR is at least 10 times greater than that observed for single cleavage.

Aspect 22: The method of Aspect 1, 6, or 8, wherein the fraction of HR reads relative to the number of total mutant reads (NHEJ+HR) is at least 10 times greater than that observed for single cleavage.

Aspect 23: The method of Aspect 1, 6, or 8, wherein the percent of HR reads relative to the number of total mutant reads (NHEJ+HR) is at least 3%.

Aspect 24: The method of Aspect 1, 6, or 8, wherein the first double-strand break is created by a Cas endonuclease and guide RNA complex.

Aspect 25: The method of Aspect 1, 6, or 8, wherein the method is performed in vitro.

Aspect 26: The method of Aspect 1, 6, or 8, wherein the method is performed in vivo.

Aspect 27: The method of Aspect 1, 6, or 8, wherein the method is performed in a cell.

Aspect 28: The method of Aspect 26, wherein the method is performed in an animal cell.

Aspect 29: The method of Aspect 26, wherein the method is performed in a plant cell.

Aspect 30: The method of Aspect 29, wherein the plant cell is obtained or derived from a plant selected from the group consisting of: maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, switchgrass, soybean, Brassica, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, vegetable, and safflower.

Aspect 31: The method of Aspect 28 or 29, wherein the method confers a benefit to the cell or to an organism comprising said cell.

Aspect 32: The method of Aspect 31, wherein the benefit is selected from the group consisting of: improved health, improved growth, improved fertility, improved fecundity, improved environmental tolerance, improved vigor, improved disease resistance, improved disease tolerance, improved tolerance to a heterologous molecule, improved fitness, improved physical characteristic, greater mass, increased production of a biochemical molecule, decreased production of a biochemical molecule, upregulation of a gene, downregulation of a gene, upregulation of a biochemical pathway, downregulation of a biochemical pathway, stimulation of cell reproduction, and suppression of cell reproduction.

Aspect 33: The method of Aspect 31, wherein the cell is a plant cell, and wherein the benefit is a trait of agronomic interest of a plant comprising said cell or a progeny cell thereof, selected from the group consisting of: disease resistance, drought tolerance, heat tolerance, cold tolerance, salinity tolerance, metal tolerance, herbicide tolerance, improved water use efficiency, improved nitrogen utilization, improved nitrogen fixation, pest resistance, herbivore resistance, pathogen resistance, yield improvement, health enhancement, improved fertility, vigor improvement, growth improvement, photosynthetic capability improvement, nutrition enhancement, altered protein content, altered oil content, increased biomass, increased shoot length, increased root length, improved root architecture, modulation of a metabolite, modulation of the proteome, increased seed weight, altered seed carbohydrate composition, altered seed oil composition, altered seed protein composition, altered seed nutrient composition; as compared to an isoline plant not comprising said target site modification or as compared to the plant prior to the modification of said target site in said plant cell.

Aspect 34: The method of Aspect 31, wherein the cell is an animal cell, and wherein the benefit is a phenotype of physiological interest of an organism comprising said animal cell or a progeny or progeny cell thereof, selected from the group consisting of: improved health, improved nutritional status, reduced disease impact, disease stasis, disease reversal, improved fertility, improved vigor, improved mental capacity, improved organism growth, improved weight gain, weight loss, modulation of an endocrine system, modulation of an exocrine system, reduced tumor size, reduced tumor mass, stimulated cell growth, reduced cell growth, production of a metabolite, production of a hormone, production of an immune cell, and stimulation of cell production.

While the invention has been particularly shown and described with reference to a preferred embodiment and various alternate embodiments, it will be understood by persons skilled in the relevant art that various changes in form and details can be made therein without departing from the spirit and scope of the invention. For instance, while the particular examples below may illustrate the methods and embodiments described herein using a specific plant, the principles in these examples may be applied to any plant. Therefore, it will be appreciated that the scope of this invention is encompassed by the embodiments of the inventions recited herein and in the specification rather than the specific examples that are exemplified below. All cited patents, applications, and publications referred to in this application are herein incorporated by reference in their entirety, for all purposes, to the same extent as if each were individually and specifically incorporated by reference.

EXAMPLES

The following are examples of specific embodiments of some aspects of the invention. The examples are offered for illustrative purposes only, and are not intended to limit the scope of the invention in any way. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperatures, etc.), but some experimental error and deviation should, of course, be allowed for.

Example 1: Co-Delivery of RNA Guided Cas9 and DNA Repair Template into Plants

In this example, methods to deliver both a DNA repair template and DNA expression cassettes encoding a Cas9 protein and single guide RNA (sgRNA) into a plant cell are described. Various transformation methods known to be effective in plants, including particle-mediated delivery, Agrobacterium-mediated transformation, PEG-mediated delivery, and electroporation, may be used.

Particle-Mediated Delivery of DNA Expression Cassettes-Rapid Transient Experiments

Plant optimized Streptococcus pyogenes (Spy) Cas9 and guide RNA expression cassettes (Svitashev et al. (2015) Plant Physiology. 169:931-945) were co-delivered with a homology-directed repair (HDR) capable donor DNA template using particle-mediated transformation in the presence of BBM and WUS2 genes as described in Ananiev, E. et al. (2009) Chromosoma. 118:157-77 and Svitashev et al. (2015) Plant Physiology. 169:931-945. Briefly, DNA expression cassettes and DNA repair templates designed as shown in FIG. 1A-E were co-precipitated onto 0.6 μm (average size) gold particles utilizing TransIT-2020. Next, the DNA coated gold particles were pelleted by centrifugation, washed with absolute ethanol and re-dispersed by sonication. Following sonication, 10 μl of the DNA coated gold particles were loaded onto a macrocarrier and air dried. Next, biolistic transformation of 8 to 10-day-old immature maize embryos (IMEs) was performed using a PDS-1000/He Gun (Bio-Rad) with a 425 lb per square inch rupture disc. Since particle gun transformation can be highly variable, a visual marker DNA expression cassette encoding a fluorescent protein was also co-delivered to aid in the selection of evenly transformed IMEs for transient experiments.

Particle-Mediated Delivery of DNA Expression Cassettes-Stable Transformation Experiments

Transformation of maize immature embryos using particle delivery is performed as follows. Media recipes follow below.

The ears are husked and surface sterilized in 30% Clorox bleach plus 0.5% Micro detergent for 20 minutes, and rinsed two times with sterile water. The immature embryos are isolated and placed embryo axis side down (scutellum side up), 25 embryos per plate, on 560Y medium for 4 hours and then aligned within the 2.5-cm target zone in preparation for bombardment. Alternatively, isolated embryos are placed on 560L (Initiation medium) and placed in the dark at temperatures ranging from 26° C. to 37° C. for 8 to 24 hours prior to placing on 560Y for 4 hours at 26° C. prior to bombardment as described above.

Plasmids comprising plant optimized Streptococcus pyogenes (Spy) Cas9 and guide RNA expression cassettes (Svitashev et al. (2015) Plant Physiology. 169:931-945) were co-delivered with a homology-directed repair (HDR) capable DNA template, with plasmids containing the developmental genes ODP2 (AP2 domain transcription factor ODP2 (Ovule development protein 2); US20090328252 A1) and Wuschel (US2011/0167516), and with an appropriate selectable marker. In the case of Hi-Type II experiments, the phosphinothricin acetyltransferase (PAT) gene encoding resistance to bialaphos (Frame, B. et al. (2000) In Vitro Cellular & Developmental Biology. 36:21-29) was used, while for inbred experiments with ED85E the neomycin phosphotransferase (NPTII) gene encoding resistance to aminoglycoside antibiotics (Fraley, R. et al. (1986) CRC Critical Reviews in Plant Science. 4:1-45) was used.

The plasmids and donor DNA are precipitated onto 0.6 micrometer (average diameter) gold pellets using a water-soluble cationic lipid transfection reagent as follows. DNA solution is prepared on ice using 1 μg of plasmid DNA and optionally other constructs for co-bombardment such as 50 ng (0.5 μl) of each plasmid containing the developmental genes ODP2 (AP2 domain transcription factor ODP2 (Ovule development protein 2); US20090328252 A1) and Wuschel. To the pre-mixed DNA, 20 μl of prepared gold particles (15 mg/ml) and 5 μl of a water-soluble cationic lipid transfection reagent are added in water and mixed carefully. Gold particles are pelleted in a microfuge at 10,000 rpm for 1 min and the supernatant is removed. The resulting pellet is carefully rinsed with 100 μl of 100% EtOH without resuspending the pellet and the EtOH rinse is carefully removed. 105 μl of 100% EtOH is added and the particles are resuspended by brief sonication. Then, 10 μl is spotted onto the center of each macrocarrier and allowed to dry about 2 minutes before bombardment.

Alternatively, the plasmids and DNA of interest are precipitated onto 1.1 μm (average diameter) tungsten pellets using a calcium chloride (CaCl2) precipitation procedure by mixing 100 μl prepared tungsten particles in water, 10 μl (1 μg total DNA) DNA in Tris EDTA buffer, 100 μl 2.5 M CaCl2, and 10 μl 0.1 M spermidine. Each reagent is added sequentially to the tungsten particle suspension, with mixing. The final mixture is sonicated briefly and allowed to incubate under constant vortexing for 10 minutes. After the precipitation period, the tubes are centrifuged briefly, liquid is removed, and the particles are washed with 500 μl 100% ethanol, followed by a 30 second centrifugation. Again, the liquid is removed, and 105 μl of 100% ethanol is added to the final tungsten particle pellet. For particle gun bombardment, the tungsten/DNA particles are briefly sonicated. 10 μl of the tungsten/DNA particles is spotted onto the center of each macrocarrier, after which the spotted particles are allowed to dry about 2 minutes before bombardment.

The sample plates are bombarded at level #4 with a Biorad Helium Gun. All samples receive a single shot at 450 PSI, with a total of ten aliquots taken from each tube of prepared particles/DNA.
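
The quantities given above for the gold-particle preparation imply an approximate load per macrocarrier, which the following minimal sketch works out. It assumes, purely for illustration, that all of the DNA binds the particles, that nothing is lost during the rinse, and that the suspension is homogeneous when spotted; the constant and function names are hypothetical and are not part of the disclosed protocol.

```python
# Illustrative arithmetic only; the values are taken from the gold-particle
# protocol described above, and the names are hypothetical.
GOLD_STOCK_MG_PER_ML = 15.0   # prepared gold particles
GOLD_VOLUME_UL = 20.0         # volume of gold stock added per tube
PLASMID_DNA_UG = 1.0          # plasmid DNA per tube (helper plasmids not counted)
RESUSPENSION_UL = 105.0       # final volume of 100% EtOH
SPOT_UL = 10.0                # volume spotted per macrocarrier


def per_shot_load():
    """Return the approximate gold and DNA load per macrocarrier.

    Assumes complete DNA binding, no loss during the rinse, and a uniform
    suspension at the time of spotting (assumptions, not measured values).
    """
    gold_ug = GOLD_STOCK_MG_PER_ML * GOLD_VOLUME_UL  # mg/ml x ul = ug
    fraction = SPOT_UL / RESUSPENSION_UL
    return {
        "gold_ug": gold_ug * fraction,
        "dna_ng": PLASMID_DNA_UG * 1000.0 * fraction,
        "max_full_spots": int(RESUSPENSION_UL // SPOT_UL),
    }


if __name__ == "__main__":
    for name, value in per_shot_load().items():
        print(name, round(value, 2))
```

Under these assumptions a 105 μl resuspension yields ten full 10 μl spots, which agrees with the ten aliquots taken from each tube.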

Following bombardment, the embryos are incubated on 560P (maintenance medium) for 12 to 48 hours at temperatures ranging from 26° C. to 37° C., and then placed at 26° C. After 5 to 7 days the embryos are transferred to selection medium containing an appropriate selective agent (e.g., bialaphos), and sub-cultured every 2 weeks at 26° C. After approximately 10 weeks of selection, selection-resistant callus clones are transferred to 288J medium to initiate plant regeneration. Following somatic embryo maturation (2-4 weeks), well-developed somatic embryos are transferred to medium for germination and transferred to a lighted culture room. Approximately 7-10 days later, developing plantlets are transferred to 272V hormone-free medium in tubes for 7-10 days until plantlets are well established. Plants are then transferred to inserts in flats (equivalent to a 2.5″ pot) containing potting soil and grown for 1 week in a growth chamber, subsequently grown an additional 1-2 weeks in the greenhouse, then transferred to Classic 600 pots (1.6 gallon) and grown to maturity. Plants are monitored and scored for transformation efficiency, and/or modification of regenerative capabilities.

Initiation medium (560L) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000× SIGMA-1511), 0.5 mg/l thiamine HCl, 20.0 g/l sucrose, 1.0 mg/l 2,4-D, and 2.88 g/l L-proline (brought to volume with D-I H₂O following adjustment to pH 5.8 with KOH); 2.0 g/l Gelrite (added after bringing to volume with D-I H₂O); and 8.5 mg/l silver nitrate (added after sterilizing the medium and cooling to room temperature).

Maintenance medium (560P) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000× SIGMA-1511), 0.5 mg/l thiamine HCl, 30.0 g/l sucrose, 2.0 mg/l 2,4-D, and 0.69 g/l L-proline (brought to volume with D-I H₂O following adjustment to pH 5.8 with KOH); 3.0 g/l Gelrite (added after bringing to volume with D-I H₂O); and 0.85 mg/l silver nitrate (added after sterilizing the medium and cooling to room temperature).

Bombardment medium (560Y) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000× SIGMA-1511), 0.5 mg/l thiamine HCl, 120.0 g/l sucrose, 1.0 mg/l 2,4-D, and 2.88 g/l L-proline (brought to volume with D-I H₂O following adjustment to pH 5.8 with KOH); 2.0 g/l Gelrite (added after bringing to volume with D-I H₂O); and 8.5 mg/l silver nitrate (added after sterilizing the medium and cooling to room temperature).

Selection medium (560R) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000× SIGMA-1511), 0.5 mg/l thiamine HCl, 30.0 g/l sucrose, and 2.0 mg/l 2,4-D (brought to volume with D-I H₂O following adjustment to pH 5.8 with KOH); 3.0 g/l Gelrite (added after bringing to volume with D-I H₂O); and 0.85 mg/l silver nitrate and 3.0 mg/l bialaphos (both added after sterilizing the medium and cooling to room temperature).

Plant regeneration medium (288J) comprises 4.3 g/l MS salts (GIBCO 11117-074), 5.0 ml/l MS vitamins stock solution (0.100 g/l nicotinic acid, 0.02 g/l thiamine HCl, 0.10 g/l pyridoxine HCl, and 0.40 g/l glycine brought to volume with polished D-I H₂O) (Murashige and Skoog (1962) Physiol. Plant. 15:473), 100 mg/l myo-inositol, 0.5 mg/l zeatin, 60 g/l sucrose, and 1.0 ml/l of 0.1 mM abscisic acid (brought to volume with polished D-I H₂O after adjusting to pH 5.6); 3.0 g/l Gelrite (added after bringing to volume with D-I H₂O); and 1.0 mg/l indoleacetic acid and 3.0 mg/l bialaphos (added after sterilizing the medium and cooling to 60° C.).

Hormone-free medium (272V) comprises 4.3 g/l MS salts (GIBCO 11117-074), 5.0 ml/l MS vitamins stock solution (0.100 g/l nicotinic acid, 0.02 g/l thiamine HCl, 0.10 g/l pyridoxine HCl, and 0.40 g/l glycine brought to volume with polished D-I H₂O), 0.1 g/l myo-inositol, and 40.0 g/l sucrose (brought to volume with polished D-I H₂O after adjusting pH to 5.6); and 6 g/l bacto-agar (added after bringing to volume with polished D-I H₂O), sterilized and cooled to 60° C.

Example 2: Detection of Cas9 Induced Chromosomal DSB Repair in Plants

In this example, methods to detect non-homologous end-joining (NHEJ) and homology-directed repair (HDR) of Cas9 induced double-strand breaks (DSBs) are described.

Determining HDR Frequencies by Deep Sequencing

In one method, chromosomal DNA repair may be monitored transiently (Svitashev et al. (2015) Plant Physiology. 169:931-945 and Karvelis, T. et al. (2015) Genome Biology. 16:253 (Methods Section: In planta mutation detection)) and in regenerated plantlets by targeted sequencing. Briefly, 2 days after particle-mediated transformation, the 20-30 most evenly transformed immature maize embryos (IMEs) are harvested based on their fluorescence. Total genomic DNA is extracted and the region surrounding the intended target site is PCR amplified with Phusion® High-Fidelity PCR Master Mix (New England Biolabs, M0531L), adding on the sequences necessary for amplicon-specific barcodes and Illumina sequencing using “tailed” primers through two rounds of PCR, and deep sequenced. The resulting reads are then examined for the presence of NHEJ or HDR mutations at the expected site of cleavage by comparison to control experiments where the single guide RNA (sgRNA) transcriptional cassette is omitted from the transformation. The fraction of HDR reads perfectly matching the changes introduced by the DNA repair template (relative to the total mutant reads (NHEJ plus HDR)) is then calculated to provide a comparison between different biological reps and treatments.
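
The read-classification step and the HDR fraction (HDR reads relative to NHEJ plus HDR reads) can be sketched as follows. This is a simplified illustration, not the pipeline actually used: it assumes reads have already been trimmed to an identical window around the cut site, uses exact string matching, and treats any variant also seen in the no-sgRNA control as background. The function name, argument names, and the toy sequences are hypothetical.

```python
from collections import Counter


def hdr_fraction(reads, reference, hdr_edit, control_reads=()):
    """Classify amplicon reads and return (HDR / (HDR + NHEJ), counts).

    reads         -- reads trimmed to the same window around the cut site
    reference     -- unmodified target sequence for that window
    hdr_edit      -- sequence expected after perfect repair from the template
    control_reads -- reads from a transformation lacking the sgRNA cassette;
                     variants found there are treated as background
    """
    background = set(control_reads)
    counts = Counter()
    for read in reads:
        if read == reference:
            counts["wild_type"] += 1
        elif read == hdr_edit:
            counts["hdr"] += 1
        elif read in background:
            counts["background"] += 1
        else:
            counts["nhej"] += 1
    mutant = counts["hdr"] + counts["nhej"]
    fraction = counts["hdr"] / mutant if mutant else 0.0
    return fraction, counts


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_reads = ["ACGT"] * 90 + ["ACTT"] * 2 + ["AGT"] * 8
    fraction, counts = hdr_fraction(toy_reads, "ACGT", "ACTT")
    print(fraction, dict(counts))
```

A production analysis would normally align reads and tolerate sequencing errors rather than rely on exact matches; the sketch only makes the definition of the reported fraction explicit.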

Determining HDR Frequencies by PCR

In another method, PCR can be used to detect a change introduced as a result of HDR (Svitashev et al. (2015) Plant Physiology. 169:931-945 and Shi et al. (2017) Plant Biotechnology Journal. 15:207-216). In this approach, one oligonucleotide is designed to be specific to the HDR modification while a second oligonucleotide is oriented in the flanking genomic DNA (outside of the region with homology to the repair template) to permit the PCR amplification of DSBs that have undergone HDR. Then, the presence of an amplification product is used to detect tissue that has been repaired with the donor template. Both agarose gel-based and quantitative PCR (qPCR) methods can be used to detect HDR. In the case of qPCR, the proportion of the tissue that contains the HDR modification can be estimated.

In a variation of the PCR-based method, PCR primers can also be designed to amplify across the HDR induced alteration. With this approach, repair using the donor template should result in a size difference (either bigger or smaller) at the DSB so that the size of the PCR amplicon can be used to detect repair with the donor template. In another method, a combination of PCR-based methods can be used to assess the presence, frequency, and intactness of an HDR edit.

Example 3: Recurrent Cleavage Enhances Cas Endonuclease-Mediated Homology-Directed Repair (HDR)

In this example, methods to enhance the homology-directed repair (HDR) of chromosomal DNA double strand breaks (DSBs) generated by a CRISPR-associated (Cas) endonuclease are described.

Cellular repair of Cas induced DSBs utilizes the non-homologous end-joining (NHEJ) and homology-directed DNA repair pathways (Hsu, P. D. et al. (2014) Cell. 157:1262-1278). NHEJ repair may result in the imprecise insertion or deletion (indel) of DNA base pairs (bps) at a chromosomal DNA target site and is useful for disrupting (knocking-out) gene expression. In contrast, homology-directed repair (HDR) offers a highly precise method to introduce desired changes into DNA using an exogenously supplied DNA repair template (DRT) (Capecchi, M. R. (1989) Science. 244:1288-1292). The NHEJ pathway is typically the most prevalent DSB repair pathway, making the recovery of HDR-mediated alterations infrequent (Capecchi, M. R. (1989) Science. 244:1288-1292). To increase the frequency of HDR, we devised a strategy that permits Cas9 to be programmed to repeatedly cleave a DNA target (FIG. 2A). Since the cellular repair of Cas9 induced DSBs is non-random and reproducible (see, for example, Van Overbeek, M. et al. (2016) Mol. Cell. 63:633-646), our approach targeted not only the first intended site for cleavage but also sites comprised of the most prevalent NHEJ repair outcomes. By doing this, DSBs were generated in a step-wise recurrent manner (e.g., the intended target sequence was cleaved, and, once repaired to a prevalent NHEJ outcome, it was cleaved again), providing multiple opportunities for HDR-mediated DNA repair to occur. Since Cas9 has high affinity for and is slow to release its cleaved substrate (Richardson, C. et al. (2016) Nat. Biotechnol. 34:339-344), our approach also extended the cellular window of time in which DSBs occur and are repaired.
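
The "multiple opportunities" reasoning can be illustrated with a toy probability model. The sketch below is an assumption-laden illustration and not a model used in, or fitted to, the experiments described here: it assumes a constant per-cleavage HDR probability, and a constant fraction of NHEJ outcomes that match a supplied guide and can therefore be cleaved again. It does not account for other factors discussed in this example, such as target sites flanking the repair template or the extended time window for repair. All names and numeric values are hypothetical.

```python
def cumulative_hdr(p_hdr, recleavable, rounds):
    """Cumulative probability of HDR after a number of cleavage rounds.

    p_hdr       -- assumed probability that a single DSB is repaired by HDR
    recleavable -- assumed fraction of NHEJ outcomes recognized by a
                   supplied guide RNA (and therefore cleaved again)
    rounds      -- number of cleavage/repair rounds considered
    """
    total = 0.0
    still_cleavable = 1.0  # probability the site is a substrate this round
    for _ in range(rounds):
        total += still_cleavable * p_hdr
        still_cleavable *= (1.0 - p_hdr) * recleavable
    return total


if __name__ == "__main__":
    # Illustrative values only; they are not measurements from this disclosure.
    for n in (1, 2, 3):
        print(n, cumulative_hdr(p_hdr=0.01, recleavable=0.5, rounds=n))
```

In this simplified picture each additional round can only add to the cumulative HDR probability, which is the qualitative point made above; the model is not intended to reproduce the magnitudes reported in the figures.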

Improvements to HDR using the recurrent cleavage (RC) approach were evaluated when, in addition to the initial target (iTarget), either one (NHEJ1) or two (NHEJ1 and NHEJ2) of the most prevalent NHEJ mutations were also targeted for cleavage. First, the most prevalent mutations resulting from cleavage and cellular repair for 4 Zea mays Cas9 target sites (Table 1) were determined.

TABLE 1
Zea mays genomic target sites examined for improvements to HDR. PAM represents protospacer adjacent motif.

Target Site    Zea mays                               sgRNA Target Sequence   3′
Name           Genotype     Location (AGPv4)          (SEQ ID NO)             PAM
LIG1           Hi-Type II   Chr2: 4236455-4236434     1                       AGG
MS26           Hi-Type II   Chr1: 14704689-14704667   2                       AGG
MS45           Hi-Type II   Chr9: 143632609-143632630 3                       AGG
TS45           ED85E        Chr1: 15409712-15409735   4                       AGG
LIG1 NHEJ1     Hi-Type II   N/A                       29                      AGG
LIG1 NHEJ2     Hi-Type II   N/A                       30                      AGG
MS26 NHEJ1     Hi-Type II   N/A                       31                      AGG
MS26 NHEJ2     Hi-Type II   N/A                       32                      AGG
MS45 NHEJ1     Hi-Type II   N/A                       33                      AGG
MS45 NHEJ2     Hi-Type II   N/A                       34                      AGG
TS45 NHEJ1     ED85E        N/A                       35                      AGG
TS45 NHEJ2     ED85E        N/A                       36                      AGG

Particle gun transformation was performed as described in Example 1 and analysis of the most prevalent mutations resulting from target cleavage and repair by NHEJ was accomplished using Illumina deep sequencing as described in Example 2. Transformations were performed in duplicate from immature embryos isolated from different sources to account for both technical and biological variation except for the MS26 site. For this target, 3 replicates were performed as the second most prevalent mutation was unclear from the first two replicates. The top 5 most prevalent NHEJ mutations for LIG1, MS26, MS45, and TS45 are shown in FIGS. 7A-D.

Next, SpyCas9 single guide RNA (sgRNA) DNA plasmid expression cassettes capable of directing Cas9 cleavage at the two most prevalent NHEJ repair outcomes, NHEJ1 and NHEJ2, for LIG1, MS26, MS45, and TS45 were constructed. Depending on the type of NHEJ mutation, the SpyCas9 sgRNA spacer length was adjusted to maintain a length of ˜20 nt (FIGS. 7A-D). To promote robust U6 expression, an additional G was incorporated onto the 5′ end of the sgRNAs targeting the most prevalent NHEJs if their spacer terminated in any other nucleotide. The NHEJ1 and NHEJ2 sequences targeted for cleavage are listed in Table 1.
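
The two spacer-design rules just described (keep the spacer at about 20 nt, and add a 5′ G when the spacer does not already begin with one) can be sketched as below. This is a hedged illustration and not the construction procedure actually used: it assumes the input is the target-strand sequence written 5′ to 3′ and ending immediately before the PAM, and it trims from the PAM-distal end. The function name and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def design_spacer(protospacer, length=20):
    """Return a spacer for a U6-driven sgRNA (illustrative sketch).

    protospacer -- sequence (5'->3') ending immediately upstream of the PAM
    length      -- number of PAM-proximal nucleotides to keep (about 20 nt)
    """
    seq = protospacer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("protospacer must be a non-empty A/C/G/T string")
    spacer = seq[-length:]  # keep the PAM-proximal bases
    if not spacer.startswith("G"):
        spacer = "G" + spacer  # extra 5' G to support U6 expression
    return spacer


if __name__ == "__main__":
    # Invented sequences for illustration only.
    print(design_spacer("ACGTACGTACGTACGTACGTAC"))
    print(design_spacer("ATTTCCCGGGAAATTTCCCG"))
```

When the retained 20 nt already begin with G, the spacer is returned unchanged; otherwise the result is 21 nt long because of the added 5′ G.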

Since LIG1 exhibited a strong preference for a single NHEJ mutation, it was used as a test case. Plasmid DRTs were engineered to introduce a small change (5 bp) adjacent to the Cas9 cut-site in order to easily detect HDR by deep sequencing. DRTs with and without flanking target sites were also tested (FIGS. 1D and 1E). Flanking target sites were comprised of the most prevalent NHEJ mutation (NHEJ1). Particle gun transformation and analysis by Illumina deep sequencing were performed as described in Examples 1 and 2, respectively. Transformations were performed in duplicate from immature embryos isolated from different sources to account for both technical and biological variation. As shown in FIG. 8, all RC Cas9 experiments (iTarget+NHEJ1; iTarget+NHEJ1+NHEJ2) enhanced HDR (with % HDR representing the fraction of HDR reads relative to all mutant (HDR+NHEJ) sequence reads). Notably, the highest frequency of HDR was recovered by targeting the iTarget as well as the two most prevalent NHEJ mutation types (NHEJ1 and NHEJ2). Relative to experiments targeting only the iTarget (and without flanking DNA targets), HDR increased nearly 20-fold (FIG. 8).

Next, the targets flanking the DRT were evaluated for their impact on HDR. Experiments were conducted as described above with and without flanking targets and the content of the flanking target was varied (FIGS. 9A and B). In some instances, the flanking target was homologous to the iTarget (FIG. 9A) while in others it comprised targets with partial homology to the iTarget (NHEJ1) (FIG. 9B). In general, flanking the DRT with a target site increased the frequency of HDR and frequencies were further elevated using RC Cas9 (FIG. 10). Notably, HDR outcomes were the highest when RC Cas9 and a DRT with partially homologous flanking target sites were used (FIG. 10).

To confirm these findings, HDR frequencies were assessed at two other Zea mays Cas9 sites, MS26 and MS45. Additionally, another control was added to further examine the finding that DRTs with partially homologous flanking targets (comprised of NHEJ1) provided the largest increase in HDR. This was an additional DRT that contained flanking targets that had no homology to the iTarget (FIG. 9C). As shown in FIGS. 11 and 12, the combination of RC Cas9 plus the NHEJ1 flanking the DRT as described for LIG1 consistently produced higher frequencies of HDR across all 3 target sites. On average, RC Cas9 with a repair template flanked by a site partially homologous to the iTarget provided a 28-fold increase in HDR frequencies relative to experiments when only the iTarget (without DRT flanking sequences) was cleaved (FIG. 13). Also, as observed at the LIG1 target, flanking targets that contained homology with the iTarget (either complete or partial homology) enhanced HDR, with the highest frequencies being recovered for targets with partial homology (NHEJ1) to the iTarget (FIG. 13). To determine statistical significance among treatments, a one-sided t-test was performed at a 95% confidence level. Those groups with a probability (p) value less than 0.05 were considered statistically different.

To validate our findings in regenerated plant tissue, particle-mediated transformation was performed with Hi-Type II immature embryos in the presence of BBM and Wus genes and transformed plantlets were regenerated as described in Example 1. LIG1 and MS26 target sites were used and experiments included a control to establish improvements made to HDR. The control comprised an experiment when only the iTarget was cleaved (without targets flanking the DRT). Prior to plant regeneration, Hi-Type II transformed callus from the MS26 experiment was sampled and examined for HDR as described in Example 2. Briefly, this was accomplished using a forward primer, designed to specifically amplify from within the HDR edit, and a reverse primer, placed adjacent to the genomic region with homology to the repair template (FIG. 14A). The MS26 HDR edit was detectable in almost all of the transformed tissues compared to only a few instances in the control (FIGS. 14B, C). After plant regeneration, the frequency of HDR was also significantly enhanced (FIG. 15). At the MS26 site, an approximately 6-fold increase in HDR was observed. Note that only plants that were likely to contain a germline HDR edit (defined as having >30% of all sequence reads with the HDR edit) were used in the calculation. For the LIG1 target, an even greater improvement in HDR was observed where no edits were detected in the control and approximately 6% of the plantlets exhibited HDR. Interestingly, of the plants that contained edits generated with RC Cas9, approximately ⅓ were predicted to contain HDR alterations on both alleles (bi-allelic) (FIG. 15).

Next, to confirm that the RC approach not only works for themodification of a few base pairs but also for the insertion of largerDNA fragments (e.g. transgenes), a DRT comprised of a DNA expressioncassette capable of expressing NPTII was constructed (FIG. 16). Next,particle gun transformation was performed with inbred ED85E as describedin Example 1. Similar to previous experiments, the DRT was flanked withthe NHEJ1 Target and the iTarget as well as the two most prevalent NHEJmutations were targeted for cleavage. Enhancements to HDR were measuredusing experiments when only the iTarget was cleaved (without targetsflanking the DRT). Only plants that contained an intact germline HDRinsertion (with intactness being defined by PCR amplifying across the3.6 kb insertion and homology arms and germline probability beingassessed by qPCR of the elements found within insertion) were consideredas positive for HDR. Experiments were also setup cleaving only theiTarget using DRTs flanked with the iTarget site. As shown in FIG. 17,the improvement to HDR over the control was 5.2- and 8.2-fold foriTarget only (with iTargets flanking the DRT) and RC (with NHEJ1 targetsflanking the DRT) treatments, respectively. Similar to that observed atthe LIG1 and MS26 sites, the proportion of HDR edits for RC Cas9 thatwere determined to be bi-allelic was significantly enhanced compared tothe other treatments (3.3% for RC Cas9, 0.7% for iTarget (with DRTs withiTarget sites), and 0.0% for iTarget (without sites flanking the DRT).Moreover, the proportion of clean plants, defined as having no other DNAintegrated (e.g. Cas9, sgRNA, BBM, Wus) except the NPTII expressioncassette at the target site, was also elevated for RC Cas9 (FIG. 17).

Altogether, the recurrent Cas9 approach described here significantly (>6-10 fold) enhanced homology-directed repair (HDR). Our method is exemplified both in rapid transient experiments and in regenerated Zea mays plants.

Example 4: Recursive Cleavage Enhances Cas Endonuclease-Mediated Homology-Directed Repair (HDR)

FIG. 2B shows an illustration with an example of improving the frequency of HDR via recursive cleavage of a target polynucleotide site.

Briefly, a double-strand break is created, repaired, and recursively cleaved by any method or composition, for example but not limited to a Cas endonuclease and guide RNA. A DSB-inducing agent (e.g., a Cas endonuclease and a first guide RNA) recognizes, binds to, and cleaves a target polynucleotide. The first guide RNA is provided as a DNA sequence on a plasmid that further comprises a spacer sequence. In some aspects, the DNA encoding the gRNA is operably linked to a regulatory expression element. A first double-strand break is created and repaired. The composition of the repaired target polynucleotide is used as the basis of a mutation generated by Cas editing of the spacer on the plasmid comprising the gRNA DNA and spacer. The mutated spacer composition directs the generation of a second gRNA that is complementary to the sequence of the repaired target polynucleotide of the first DSB, and a second double-strand break is induced at the target site by the Cas endonuclease and the second gRNA. The cycle may then repeat, with the sequence of the newly repaired second DSB then being used as a template for the composition of a third gRNA that is complementary to the sequence of the repaired second DSB polynucleotide, and so forth. In this manner a loop of DSB generation and repair occurs, with each subsequent repair after the first having a higher probability of repair via HDR than NHEJ, as compared to the mechanism of the first repair. The process may stop or proceed at a reduced rate by any of a number of methods, including but not limited to: titrated reagent availability, a mutation induced in the region of the gRNA DNA expression construct that renders the expression cassette or transcribed gRNA non-functional, an external factor that may optionally be inducible or repressible, or the introduction of another molecule.
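
The cut-and-repair loop described above can be illustrated with a toy calculation. The sketch below is not a model taken from these experiments; it assumes, purely for illustration, that every cut is repaired either by HDR (with a fixed per-cut probability) or by NHEJ, that an HDR-edited allele is no longer cleaved, and that an NHEJ-repaired allele is cleaved again in the next round by an updated guide. The function name and the probability value are hypothetical.

```python
def cumulative_hdr(p_hdr_per_cut, rounds):
    """Cumulative fraction of alleles carrying an HDR edit after `rounds` cuts.

    Toy assumptions (not from the source): each cut resolves as HDR with
    probability p_hdr_per_cut, otherwise as NHEJ; HDR alleles leave the loop;
    NHEJ alleles are re-cut in the following round.
    """
    if not 0.0 <= p_hdr_per_cut <= 1.0:
        raise ValueError("p_hdr_per_cut must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    still_cuttable = 1.0  # fraction of alleles not yet repaired by HDR
    hdr = 0.0
    for _ in range(rounds):
        hdr += still_cuttable * p_hdr_per_cut
        still_cuttable *= 1.0 - p_hdr_per_cut
    return hdr


# Hypothetical per-cut HDR probability of 10%.
for n in (1, 2, 3, 5):
    print(n, round(cumulative_hdr(0.10, n), 4))
```

Even with a constant per-cut probability, the cumulative HDR fraction grows with each additional round in this sketch; a per-round probability that rises after the first repair, as described above, would increase it further.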

Example 5: Introduction of Nick(s) Adjacent to the DSB Enhances Cas Endonuclease-Mediated Homology-Directed Repair

A novel method achieving the enhancement of homologous recombination after the creation of a double strand break in DNA is described herein. The platform was based on the induction of a double strand break at a specific target site in chromosomal maize DNA using an RNA-protein complex of a Cas endonuclease (e.g., B. laterosporus or S. aureus Cas9) and a guide RNA that binds to the Cas endonuclease and has sequences complementary to the target chromosomal DNA. Concurrently, nicks were created, one on each side of the double strand break, using one or more nickases (for example, Cas molecules with their own guide RNAs that are not identical to the Cas9 that made the double strand break). The specific orientation of the nicks on specific strands of the DNA was designed to generate 3′ overhangs at the double strand break, which were recombinogenic. FIG. 3 describes this orientation and the Cas9s used to generate these data. Although specific target sites, DSB agents (e.g., B. laterosporus Cas9), and nicking agents (S. pyogenes Cas9 molecules carrying a D10A mutation that renders the enzyme a nickase instead of a DSB agent) were used in some of these particular experiments, it is contemplated that other molecules could be used to create the DSB and/or the nick sites flanking the DSB site; further, these results are target site-independent. In the following example, “Spy” means S. pyogenes, “Blat” means B. laterosporus, and “Sa” means S. aureus. “Spyn” means S. pyogenes Cas9 nickase, etc.

The flanking nickase guides for each target site are listed in Table 2, along with the PAM sequences for each of the Cas9 nickases. The repair templates used for each experiment are listed in Table 3. The sequences for each target site are given in Table 4.

TABLE 2 Flanking nickase guides and PAM sequences for the “Nick-DSB-Nick” experiments

Guide               Spacer (DNA)               Spacer SEQID   PAM

flanking Spy Guides for MS45 BLAT target
MS45-BLAT2-Spyn1    GAACACAAACCACACGAATC       49             TGG
MS45-BLAT2-Spyn2    GTTGTCGGGGAAGCCCGGC        50             AGG
MS45-BLAT2-Spyn3    GCCGTTGGAGCGCACGTTGTC      51             GGG
MS45-BLAT2-Spyn4    GAACTGGCCCCTGCCGT          52             TGG
MS45-BLAT2-Spyn5    (G)TCCGAGACAACAAACTGC      53             AGG

flanking BLAT Guides for named Spy sites
M16-BLAT-L          (g)tgtgtctttgccatatgtt     54             tagtcaaa
M16-BLAT-R          (g)agaacctaataaatttc       55             cattcaaa
NLB18_8_BLAT-L      gtatccagcagtttacgcatatgg   56             tcactcaa
NLB18_8_BLAT-R      gtgcaagccgtagccagacg       57             atgtcgaa
NLB_8_BLAT-L        gcacttgtgttggacaataaa      58             tcaccaaa
NLB_8_BLAT-R        ggtcctaccatcgtcttctaa      59             tcctcaaa

flanking Spy Guides for named Sa sites
MS45-Spy-L          gtcgccgacgcgtacta          60             cgg
MS45-Spy-R          (g)ctttcagcagagatacag      61             agg
MS26-Spy-L          gcgtgacgatgatgttgg         62             cgg
MS26-Spy-R          gagccaggcatccaaggc         63             agg

flanking Sa Guides for named Spy sites
TS50-Sa-L           gtgatcagagtcgtgatttg       64             aagaat
TS50-Sa-R           gtgcctggtttcttgcactacc     65             tcgggt
TS45-Sa-L           gtgagactaatgaaaatcacat     66             ctggat
TS45-Sa-R           (g)aaagaactaattaagcttcga   67             tagggt
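
As a quick consistency check on Table 2, the listed PAMs can be compared against a PAM motif written in IUPAC nucleotide codes. The sketch below assumes the commonly reported consensus motifs NGG for S. pyogenes Cas9 and NNGRRT for S. aureus Cas9; these motifs are an assumption supplied for illustration and are not stated in this disclosure. The B. laterosporus PAMs are left out because no motif is assumed for them here. The helper name is hypothetical.

```python
import re

# IUPAC codes needed for the two motifs used below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def matches_pam(pam, motif):
    """Return True if `pam` fits the IUPAC `motif` (case-insensitive)."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    return re.fullmatch(pattern, pam.upper()) is not None


# Assumed consensus motifs (not taken from the text).
SPY_PAM = "NGG"      # S. pyogenes Cas9
SA_PAM = "NNGRRT"    # S. aureus Cas9

# PAMs copied from the Spy and Sa guide rows of Table 2.
spy_pams = ["TGG", "AGG", "GGG", "TGG", "AGG", "cgg", "agg", "cgg", "agg"]
sa_pams = ["aagaat", "tcgggt", "ctggat", "tagggt"]

print(all(matches_pam(p, SPY_PAM) for p in spy_pams))
print(all(matches_pam(p, SA_PAM) for p in sa_pams))
```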

TABLE 3 Repair templates.

Repair Template        SEQID   Sequence
N1545-BN1m Sense       37      ATTCATAGACTTGAACACAAACCACACGAATCTTTCCTTCACCAGGATAATGAGGTAATGGCTCCTGCAGGGGAAGGTCCAAGAGCGGGCGAGGTAGAGGTGTTCGCGAAAATGCCGGGCTTCAACGACAACGTGCGCTAAAACGGCAG
N1545-BN1m AntiSense   38      CTGCCGTTTTAGCGCACGTTGTCGTTGAAGCCCGGCATTTTCGCGAACACCTCTACCTCGCCCGCTCTTGGACCTTCCCCTGCAGGAGCCATTACCTCATTATCCTGGTGAAGGAAAGATTCGTGTGGTTTGTGTTCAAGTCTATGAAT
M16-BN-S               39      gagcattctctaatttagtttttttctgtatctaccatggcggaagtCCTGCAGGggatggtttatgattaggatgattatacaggagtaccttgtactatctttcta
M16-BN-AS              40      tagaaagatagtacaaggtactcctgtataatcatcctaatcataaaccatccCCTGCAGGacttccgccatggtagatacagaaaaaaactaaattagagaatgctc
NLB-CR8-BN-S           41      ttccctttcacttaaagacagactaaatacccgggcactcctcagacagcgagctatgccgctggattcatacacctgtgacaactgtatcttacaagccgaagCCTGCAGGagacagtccttcatctttttttcagatgcagct
NLB-CR8-BN-AS          42      agctgcatctgaaaaaaagatgaaggactgtctCCTGCAGGcttcggcttgtaagatacagttgtcacaggtgtatgaatccagcggcatagctcgctgtctgaggagtgcccgggtatttagtctgtctttaagtgaaagggaa
MS45-Sa-BN-S           43      tccgtcgcgagggaagccgacggggaccccatccggttcgcgaacgacctcgatgtgCCTGCAGGcacaggaatggatccgtattcttcactgacacg
MS45-Sa-BN-AS          44      cgtgtcagtgaagaatacggatccattcctgtgCCTGCAGGcacatcgaggtcgttcgcgaaccggatggggtccccgtcggcttccctcgcgacgga
TS50-BN-S              45      ggtagcatgtgtgaatcccatttctcctagaaccacactgaacaaCCTGCAGGcaacggcagagttgtttgtatctagatctcgaccgatccttcacc
TS50-BN-AS             46      ggtgaaggatcggtcgagatctagatacaaacaactctgccgttgCCTGCAGGttgttcagtgtggttctaggagaaatgggattcacacatgctacc
TS45_Sense             47      ATCTGGATCCTGAAATCGGCGTCGTAACCTACAAGGCCACGGACTGGATTAGATAGTGGTCCATGGTGCATAATGAGGAT *GAGG CCTGCAGG ATGA *GAGCAATCAT TGTTCAAGACATGATGCAAAGCTAGAAAACTTTGATTGTGGCCGTCCTAATTGTGAAGTTTAGGCCGGGG
TS45_Antisense         48      CCCCGGCCTAAACTTCACAATTAGGACGGCCACAATCAAAGTTTTCTAGCTTTGCATCATGTCTTGAACAATGATTGCTCTTCATCCTGCAGGCCTCCATCCTCATTATGCACCATGGACCACTATCTAATCCAGTCCGTGGCCTTGTAGGTTACGACGCCGATTTCAGGATCCAGAT

Bold & Italic font with an * = introduced SNP. Bold & Underlined font = 8 bp insertion making up the SbfI target site. Large bold type font = cleavage position of S. pyogenes Cas9.
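
Two properties of a sense/antisense template pair in Table 3 can be checked with a few lines of code: that the two strands are reverse complements of each other, and that the template carries the 8 bp SbfI recognition sequence (CCTGCAGG). The sketch below uses the M16 pair (SEQID 39 and 40) as transcribed from the table; the helper and variable names are hypothetical.

```python
# Case-preserving complement table for DNA.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

SBFI_SITE = "CCTGCAGG"  # 8 bp SbfI recognition sequence named in the table footnote


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


# M16 repair template strands as listed in Table 3.
m16_sense = (
    "gagcattctctaatttagtttttttctgtatctaccatggcggaagt"
    "CCTGCAGG"
    "ggatggtttatgattaggatgattatacaggagtaccttgtactatctttcta"
)
m16_antisense = (
    "tagaaagatagtacaaggtactcctgtataatcatcctaatcataaaccatcc"
    "CCTGCAGG"
    "acttccgccatggtagatacagaaaaaaactaaattagagaatgctc"
)

print(reverse_complement(m16_sense) == m16_antisense)
print(m16_sense.upper().count(SBFI_SITE))
```

Because CCTGCAGG is its own reverse complement, the site reads identically on both strands of the template.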

Table 4. Vectors and SEQIDs for the target site

The first experiment tested a “Spy-Blat-Spy” nickase-DSB agent-nickase strategy at the maize MS45-BLAT2 target site (FIG. 4A), with three different combinations of nickase target sites. Co-delivered with the BlatCas9 DSB agent were Cas9 nickases (SpynCas9), different guides according to FIG. 4A, and a double strand DNA repair template oligo of 180 bp length. The repair template included insertion sequences that were observed as a specific template-based change via deep sequencing of bulk DNA from the embryos. Sequence changes were detected in the target sequence. FIG. 4A depicts the target sites of the nickase SpynCas9 guides on the target sequence, labeled 1, 2, 3, and 4, respectively. FIGS. 4B and 4C show the frequencies of mutations at the target site when: BLATCas9 and SpynCas9 were delivered but no guides were provided (no guide); BLAT and its cognate guide were used alone (BLAT only); and BLAT and its cognate guide were co-delivered with SpynCas9 and specified Spy guides. Mutant reads for each are shown in FIG. 4B and HDR frequencies for each are shown in FIG. 4C.
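
The read-based quantities reported in these experiments (mutant reads, HDR frequency, and the share of HDR among all mutant reads) can be derived from deep-sequencing counts along the following lines. This is a sketch under stated assumptions: the function name, the dictionary keys, and the example counts are hypothetical, and reads are assumed to have already been classified as NHEJ, HDR, or unmodified.

```python
def editing_metrics(total_reads, nhej_reads, hdr_reads):
    """Summarize editing outcomes from classified read counts at one target site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if nhej_reads < 0 or hdr_reads < 0 or nhej_reads + hdr_reads > total_reads:
        raise ValueError("inconsistent read counts")
    mutant = nhej_reads + hdr_reads
    return {
        # All modified reads (NHEJ + HDR) as a fraction of total reads.
        "mutant_frequency": mutant / total_reads,
        # HDR reads as a fraction of total reads.
        "hdr_frequency": hdr_reads / total_reads,
        # HDR reads as a fraction of all mutant reads.
        "hdr_fraction_of_mutants": hdr_reads / mutant if mutant else 0.0,
        # Ratio of HDR to NHEJ; None when no NHEJ reads were seen.
        "hdr_to_nhej_ratio": hdr_reads / nhej_reads if nhej_reads else None,
    }


# Hypothetical counts for one sample, for illustration only.
print(editing_metrics(total_reads=100000, nhej_reads=4000, hdr_reads=1000))
```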

The next experiment tested a “Spy-Blat-Spy” nickase-DSB agent-nickase strategy at an additional three combinations of different nickase target sites. The MS45-BLAT2 target site was targeted for a double strand break via the BlatCas9. Co-delivered were Cas9 nickases (SpynCas9), different guides according to FIG. 4D, and a double strand DNA repair template oligo of 180 bp length. The repair template included insertion sequences that were observed as a specific template-based change via deep sequencing of bulk DNA from the embryos. Sequence changes were detected in the target sequence. FIG. 4D depicts the target sites of the nickase SpynCas9 guides on the target sequence, labeled 3, 4, and 5, respectively. FIGS. 4E and 4F show the frequencies of mutations at the target site when: BlatCas9 and SpynCas9 were delivered but no guides were provided (no guide); Blat and its cognate guide were used alone (Blat only); Blat and its cognate guide were co-delivered with nickase and specified paired nick site Spyn guides (5/3 and 5/4 pairs); SpynCas9 nickase and nick site guides (5/3 pair) were delivered without Blat or the Blat guide; and BlatCas9, Blat guide, and Spyn guides were delivered without co-delivering the nickase SpynCas9. Mutant reads for each are shown in FIG. 4E and HDR frequencies for each are shown in FIG. 4F. Enhancement of SDN2-type HDR events was observed when the nicks were between 40 and 110 basepairs from the double strand break.
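
When choosing nickase guide pairs, the 40 to 110 basepair window noted above can be applied as a simple filter on candidate nick positions. The sketch below assumes that the DSB and nick positions are given as coordinates on the same reference sequence; the function name and the example coordinates are hypothetical.

```python
def nicks_in_window(dsb_pos, left_nick, right_nick, min_bp=40, max_bp=110):
    """Check that two nicks flank the DSB and each lies min_bp..max_bp away.

    Positions are coordinates on one reference sequence (an assumption of
    this sketch). The default window is the 40 to 110 bp range noted above.
    """
    flanking = left_nick < dsb_pos < right_nick
    left_dist = dsb_pos - left_nick
    right_dist = right_nick - dsb_pos
    in_window = all(min_bp <= d <= max_bp for d in (left_dist, right_dist))
    return flanking and in_window


# Hypothetical coordinates, for illustration only.
print(nicks_in_window(dsb_pos=500, left_nick=440, right_nick=590))  # 60 and 90 bp away
print(nicks_in_window(dsb_pos=500, left_nick=480, right_nick=590))  # left nick only 20 bp away
```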

Next, a “Blat-Spy-Blat” nickase-DSB agent-nickase strategy was tested at maize target site M16 (FIG. 5A, SEQID NO:68) in one germplasm line, for five replicates. Average mutant reads are shown in FIG. 5B and HDR frequencies are shown in FIG. 5C, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase Blat and its guides; and Blat nickase with its guides only, with no SpyCas9 or its cognate gRNA.

The Blat-Spy-Blat strategy was also performed at a different maize target site (NLB-CR8) (FIG. 5D, SEQID NO:69) in a different germplasm line, for five replicates. Average mutant reads are shown in FIG. 5E and HDR frequencies are shown in FIG. 5F, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase Blat and its guides; and Blat nickase with its guides only, with no SpyCas9 or its cognate gRNA.

Next, a “Spy-Sa-Spy” strategy was performed at the MS45 maize genomic target site (FIG. 6A, SEQID NO:70), for four replicates. Average mutant reads are shown in FIG. 6B and HDR frequencies are shown in FIG. 6C, for: SaCas9 delivered with no guides; SaCas9 delivered with its cognate gRNA; SaCas9, its cognate gRNA, nickase SpynCas9 and its guides; and SpynCas9 nickase with its guides only, with no SaCas9 or its cognate gRNA.

Next, a “Sa-Spy-Sa” strategy was performed at the TS50 maize genomic target site (FIG. 6D, SEQID NO:71), for three replicates. Average mutant reads are shown in FIG. 6E and HDR frequencies are shown in FIG. 6F, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase SaCas9 and its guides; and SaCas9 nickase with its guides only, with no SpyCas9 or its cognate gRNA.

The “Sa-Spy-Sa” strategy was also performed at a different maize genomic target site (TS45) (FIG. 6G, SEQID NO:72), for two replicates. Average mutant reads are shown in FIG. 6H and HDR frequencies are shown in FIG. 6I, for: SpyCas9 delivered with no guides; SpyCas9 delivered with its cognate gRNA; SpyCas9, its cognate gRNA, nickase SaCas9 and its guides; and SaCas9 nickase with its guides only, with no SpyCas9 or its cognate gRNA.

These results demonstrate that flanking a double-strand break with nicks can result in a higher frequency of homology-directed repair outcomes.

Taken together, these examples demonstrate that the frequency of HDR at a DSB site may be increased with different strand cleavage strategies, including: recurrent cutting at a DSB site, recursive cutting at a DSB site, and flanking the DSB with nicks. Further improvement may be achieved by flanking a donor or template with sequences homologous to the initial target site, or with sequences homologous to one of the subsequent sequences created by NHEJ repair of the DSB created at the initial or subsequent target sites.

We claim:
 1. A method of increasing the frequency of homology-directed repair at double strand break of a target polynucleotide in a cell, comprising: (a) introducing a first double-strand break to the target polynucleotide, that results in a modified target sequence; (b) providing to the modified target sequence a Cas endonuclease and a guide RNA that binds to the modified target sequence and creates a second double-strand break; and (c) allowing the second double-strand break to undergo repair; wherein the repair of the second double-strand break results in at least one nucleotide modification as compared to the sequence of the target polynucleotide.
 2. The method of claim 1, wherein steps (b) and (c) are repeated for 1, 2, 3, 4, 5, or more subsequent cleavage(s) and repair(s) step(s).
 3. The method of claim 1, wherein a plurality of guide RNA molecules are provided.
 4. The method of claim 3, wherein the plurality of guide RNA molecules are provided simultaneously.
 5. The method of claim 3, wherein the plurality of guide RNA molecules are provided sequentially.
 6. A method of increasing the frequency of homology-directed repair at double strand break of a target polynucleotide, comprising: (a) introducing a first double-strand break to the target polynucleotide, that results in a first modified sequence; (b) introducing to the first modified sequence a Cas endonuclease and a recombinant DNA construct comprising a first guide RNA DNA sequence and a spacer DNA sequence, wherein the first guide RNA DNA sequence is complementary to the first modified sequence, wherein the Cas endonuclease and the guide RNA create a second double-strand break; (c) allowing the second double-strand break to undergo repair, resulting in a second modified sequence; (d) editing the DNA sequence of the spacer of the recombinant DNA construct of (b) so that it is substantially identical to the second modified sequence; (e) expressing from the recombinant DNA construct a second guide polynucleotide that is substantially complementary to the second modified sequence; (f) allowing the second guide polynucleotide and the Cas endonuclease to introduce a third double-strand break at the target polynucleotide sequence, and allowing the third double-strand break to repair; wherein the repair of the third double-strand break results in at least one nucleotide modification as compared to the sequence of the target polynucleotide.
 7. The method of claim 6, wherein steps (d) through (f) are repeated for 1, 2, 3, 4, 5, or more subsequent cleavage(s) and repair(s), wherein subsequent guide RNA(s) are designed in response to the previous cleavage repair mutation(s).
 8. A method of increasing the frequency of homology-directed repair at double strand break of a target polynucleotide, comprising: (a) introducing a double-strand break to the target polynucleotide; (b) introducing a single-strand nick to a sequence adjacent to the double-strand break; and (c) allowing the double-strand break to undergo repair; wherein the repair of the double-strand break results in at least one nucleotide modification as compared to the sequence of the target polynucleotide.
 9. The method of claim 8, wherein two nicks are created adjacent to, and flanking the double-strand break of (a).
 10. The method of claim 8, wherein the double-strand break is created by a Cas endonuclease.
 11. The method of claim 10, wherein the Cas endonuclease is Cas9.
 12. The method of claim 8, wherein the nick is created by a Cas endonuclease that has been mutated to render it incapable of creating a double-strand break but capable of creating a single-strand nick.
 13. The method of claim 8, wherein the nick in step (b) is created within 125 basepairs of the double-strand break of (a).
 14. The method of claim 1, 6, or 8, further comprising providing to the target polynucleotide a heterologous polynucleotide comprising a DNA sequence flanked by polynucleotides sharing homology to sequences flanking the target site.
 15. The method of claim 14, wherein the DNA sequence is further flanked by polynucleotides sharing homology to the target site sequence.
 16. The method of claim 14, wherein the DNA sequence is further flanked by two polynucleotides, wherein each of the two polynucleotides shares homology to a mutation created by a repair of the double-strand break.
 17. The method of claim 16, wherein each of the two polynucleotides shares homology to the first most prevalent mutation created by a repair of the double-strand break.
 18. The method of claim 16, wherein each of the two polynucleotides shares homology to the second most prevalent mutation created by a repair of the double-strand break.
 19. The method of claim 14, further comprising providing to the target polynucleotide a plurality of heterologous polynucleotides, each of which comprise a DNA sequence flanked by two polynucleotides that each share homology to two or more of the following: (a) the target site sequence, (b) the first most prevalent mutation created by a repair of the double-strand break, and (c) the second most prevalent mutation created by a repair of the double-strand break.
 20. The method of claim 1, 6, or 8, wherein at least one Cas endonuclease is provided as a protein molecule.
 21. The method of claim 1, 6, or 8, wherein at least one guide RNA is provided as an mRNA molecule.
 22. The method of claim 1, 6, or 8, further comprising introducing a heterologous polynucleotide after step (a).
 23. The method of claim 14, wherein the heterologous polynucleotide is a template for repair of the double-strand break.
 24. The method of claim 14, wherein the heterologous polynucleotide is a donor DNA molecule for insertion at the double-strand break site.
 25. The method of claim 1, 6, or 8, wherein the first double-strand break is repaired by NHEJ.
 26. The method of claim 1, 6, or 8, wherein the frequency of HDR of subsequent double-strand-break repairs is greater than the rate of HDR of the first double-strand-break repair.
 27. The method of claim 1, 6, or 8, wherein the frequency of HDR of a double-strand-break repair at a target polynucleotide site is greater than the rate of NHEJ at that same site.
 28. The method of claim 1 or 6, wherein the ratio of HDR to NHEJ increases in at least one subsequent DSB repair step as compared to the first DSB repair step with no nicks introduced.
 29. The method of claim 1, 6, or 8, wherein the cumulative frequency of HDR is greater than that observed for single cleavage.
 30. The method of claim 1, 6, or 8, wherein the cumulative percentage of HDR is at least 10 times greater than that observed for single cleavage.
 31. The method of claim 1, 6, or 8, wherein the fraction of HR reads relative to the number of total mutant reads (NHEJ+HR) is at least 10 times greater than that observed for single cleavage.
 32. The method of claim 1, 6, or 8, wherein the percent of HR reads relative to the number of total mutant reads (NHEJ+HR) is at least 3%.
 33. The method of claim 1, 6, or 8, wherein the first double-strand break is created by a Cas endonuclease and guide RNA complex.
 34. The method of claim 1, 6, or 8, wherein the method is performed in a plant cell.
 35. The method of claim 34, wherein the plant cell is obtained or derived from a plant selected from the group consisting of: maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, switchgrass, soybean, Brassica, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, vegetable, and safflower.
 36. The method of claim 34, further comprising obtaining a plant tissue, plant part or whole plant from the plant cell, wherein the plant tissue, plant part or whole plant retains the at least one nucleotide modification.
 37. A method of introducing a plurality of targeted recurrent double strand breaks at a target site in a genome, the method comprising: (a) providing a Cas endonuclease and a plurality of guide RNA molecules, wherein the plurality of guide RNA molecules are designed to recognize a plurality of modified sequences at the target site; (b) introducing a double strand break at the target site by the Cas endonuclease and a first guide RNA molecule to result in a first modified target sequence at the target site; and (c) introducing a recurrent double strand break at the target site by the Cas endonuclease and a second guide RNA molecule, wherein the second guide RNA molecule binds to the first modified target sequence at the target site.
 38. The method of claim 37, wherein the recurrent introduction of double strand breaks at the target site promotes homologous recombination.
 39. The method of claim 37, wherein the recurrent introduction of double strand breaks at the target site results in an increased frequency of cleaved DNA to increase the frequency of homologous recombination or homology-directed repair.
 40. The method of claim 37, wherein the plurality of guide RNAs are provided exogenously.